<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Duarte O.Carmo</title><link href="https://duarteocarmo.com/" rel="alternate"/><link href="https://duarteocarmo.com/feed.xml" rel="self"/><id>https://duarteocarmo.com/</id><updated>2026-04-03T00:00:00+02:00</updated><entry><title>Retrospectiva #6</title><link href="https://duarteocarmo.com/blog/retrospectiva-6.html" rel="alternate"/><published>2026-04-03T00:00:00+02:00</published><updated>2026-04-03T00:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2026-04-03:/blog/retrospectiva-6.html</id><summary type="html">&lt;p&gt;&lt;img alt="Ponte 25 de Abril over the Tagus river" src="https://duarteocarmo.com/images/100/ponte-25-de-abril.webp" /&gt;&lt;/p&gt;
&lt;p&gt;Well, now would you look at that. Article 100 on this website. Ten years of writing on this small corner of the web. I'm writing this month's newsletter from my favourite place: the airplane.&lt;/p&gt;
&lt;p&gt;Happy Easter if you celebrate. With clients spread across time zones, I didn't really get the …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;img alt="Ponte 25 de Abril over the Tagus river" src="https://duarteocarmo.com/images/100/ponte-25-de-abril.webp" /&gt;&lt;/p&gt;
&lt;p&gt;Well, now would you look at that. Article 100 on this website. Ten years of writing on this small corner of the web. I'm writing this month's newsletter from my favourite place: the airplane.&lt;/p&gt;
&lt;p&gt;Happy Easter if you celebrate. With clients spread across time zones, I didn't really get the time. Next week I'll be at &lt;a href="https://www.ai.engineer/london" target="_blank"&gt;AI Engineer London&lt;/a&gt;. Email me if you're coming too!&lt;/p&gt;
&lt;p&gt;I'm experimenting with a &lt;em&gt;slightly&lt;/em&gt; different format this week.&lt;/p&gt;
&lt;h2 id="using"&gt;Using&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://shittycodingagent.ai/" target="_blank"&gt;Pi.dev&lt;/a&gt;&lt;/strong&gt;: After watching &lt;a href="https://www.youtube.com/watch?v=0RLIlNWv1xo"&gt;this talk&lt;/a&gt; on how some people use Pi, I decided to (re-)double down. It's become my main driver. I've created a nice collection of &lt;a href="https://github.com/duarteocarmo/dotfiles/tree/master/.agents/skills"&gt;skills&lt;/a&gt; (some private), &lt;a href="https://github.com/duarteocarmo/dotfiles/tree/master/.pi/agent/extensions"&gt;extensions&lt;/a&gt;, and even ported my favorite &lt;a href="https://github.com/duarteocarmo/dotfiles/tree/master/.pi/agent/themes"&gt;theme&lt;/a&gt; to it. An &lt;a href="/blog/how-to-police-your-agents.html"&gt;agent harness&lt;/a&gt; is a bit like an editor: it needs to be &lt;em&gt;yours&lt;/em&gt;. I want to &lt;code&gt;/preview&lt;/code&gt; a longer agent plan in a nice web page. I want to answer questions in a nice interface with &lt;code&gt;/answer&lt;/code&gt;. I want the theme to adapt to my macOS system theme. And, most of all, I want to be able to use it with ANY model. Even &lt;a href="https://github.com/duarteocarmo/dotfiles/blob/master/.pi/agent/models.json#L9"&gt;local ones&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://developers.openai.com/codex/app/" target="_blank"&gt;Codex&lt;/a&gt;&lt;/strong&gt;: Some things need my full attention. Others don't. When it's a quick fix, a breaking test, a simpler analysis, or something I can run in parallel, Codex has been my go-to. You can preview diffs, open in a terminal, create PRs, all from the same place. I tend to throw at it the smaller parallel things that are not "high risk".&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.hevyapp.com/" target="_blank"&gt;Hevy&lt;/a&gt;&lt;/strong&gt;: I &lt;a href="/blog/an-opinionated-running-dashboard.html"&gt;continue to run&lt;/a&gt;. Not as much as I'd like. As you get older, weight lifting and strength training become increasingly important. After freestyling my gym routine for the past years, I wanted something a bit more sophisticated. After testing 3-4 apps, I landed on Hevy. It's paid but not expensive. &lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://apps.apple.com/us/app/snapseed/id439438619" target="_blank"&gt;Snapseed&lt;/a&gt;&lt;/strong&gt;: My inspiration to take &lt;a href="/photos"&gt;photos&lt;/a&gt; comes and goes. But I need an editing app that makes the ugly Danish sky just a tad prettier. The previous version of Snapseed looked largely abandoned so I resorted to &lt;a href="https://darkroom.co/"&gt;Darkroom&lt;/a&gt;. Now that everything is a subscription everywhere I started looking again. The new Snapseed app is excellent. &lt;/p&gt;
&lt;h2 id="reading"&gt;Reading&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.goodreads.com/book/show/222725518-empire-of-ai" target="_blank"&gt;Empire of AI - Karen Hao&lt;/a&gt;&lt;/strong&gt;: I tested a new reading routine this month: One chapter a day. Ended up finishing this one. The book has some interesting tidbits about the history of OpenAI and how it all came to be. It also has some slightly deeper commentary on the ethics of AI models and infrastructure build outs. Unfortunately, it had a bit too much &lt;em&gt;TMZ-style&lt;/em&gt; tech drama I don't really find interesting.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.baseten.co/inference-engineering/" target="_blank"&gt;Inference Engineering - Philip Kiely&lt;/a&gt;&lt;/strong&gt;: I picked this one up on my &lt;a href="/blog/goodbye-kindle-i-dont-think-ill-miss-you.html"&gt;Palma&lt;/a&gt;. Even though I just started, I can already recommend it. As I work with larger models, datasets, and infrastructure - it's a great field guide.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;"If the cluster is not on fire, you are not using it enough" &lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://emerge-lab.github.io/papers/an-unsolicited-guide-to-good-research.pdf" target="_blank"&gt;An unsolicited guide to good research - Eugene Vinitsky&lt;/a&gt;&lt;/strong&gt;: A great deck on how to do great ML research, and how to find interesting problems. I liked the framing: pragmatic and to the point. &lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.philschmid.de/kimi-composer-context" target="_blank"&gt;How Kimi, Cursor, and Chrome Train Agentic Models with RL - Phil Schmid&lt;/a&gt;&lt;/strong&gt;: An incredible roundup of the reinforcement learning techniques some of the larger players are using to train models.&lt;/p&gt;
&lt;h2 id="listening"&gt;Listening&lt;/h2&gt;
&lt;p&gt;&lt;br&gt;
&lt;lite-youtube videoid="1bsBBU_ETy8"&gt;&lt;/lite-youtube&gt;
&lt;br&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://bit.donado.co/1xAhBA" target="_blank"&gt;The Roving - Bonny Light Horseman&lt;/a&gt;&lt;/strong&gt;: What can I say? After clicking "don't recommend" on those Ruff songs on my Discover Weekly, sometimes Spotify eventually makes you discover something new and interesting.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://soundcloud.com/intuition_radio/reverie" target="_blank"&gt;Reverie - Jule&lt;/a&gt;&lt;/strong&gt;: I managed to tone down meetings and put in some nice focused stretches of work this month. And a good SoundCloud mix is essential to keep me company.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://podcasts.apple.com/us/podcast/confronting-the-ceo-of-the-ai-company-that-impersonated-me/id1011668648?i=1000756732024" target="_blank"&gt;Confronting the CEO of the AI Company that impersonated me - Decoder&lt;/a&gt;&lt;/strong&gt;: Long-term fan of the Vergecast. I love Nilay's interview style. I do think Shishir held his ground pretty well. An interesting discussion, if we set aside all the drama.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://podcasts.apple.com/us/podcast/the-handyman-of-high-art-tom-sachs-on-why-creativity/id582272991?i=1000752476132" target="_blank"&gt;The Handyman of High Art - Rich Roll&lt;/a&gt;&lt;/strong&gt;: I'm a big fan of Tom Sachs. The NY-based artist that inspired much of Casey Neistat's brutalist aesthetic. And I'm a sucker for brutalism. An interesting look at his life and work.&lt;/p&gt;
&lt;h2 id="watching"&gt;Watching&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.youtube.com/watch?v=YwZR6tc7qYg" target="_blank"&gt;Everything We Got Wrong About Research-Plan-Implement&lt;/a&gt;&lt;/strong&gt;: Loved this one. On how to tackle codebases with agents. Some interesting mentions of: &lt;em&gt;"Hey - we let the agent rip, and actually regretted it. Then we went back, and thought about how to do it properly"&lt;/em&gt;. Agents are likely to eat software, but we are still learning how to deal with it.&lt;/p&gt;
&lt;p&gt;&lt;br&gt;
&lt;lite-youtube videoid="Pdp3p23P-TI"&gt;&lt;/lite-youtube&gt;
&lt;br&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.youtube.com/watch?v=Pdp3p23P-TI" target="_blank"&gt;Living in the Matrix - Casey Neistat&lt;/a&gt;&lt;/strong&gt;: From the master, extremely relatable. I also feel like chasing the green quadrant every time. But maybe the other 3 are a necessary evil.&lt;/p&gt;
&lt;p&gt;The flight attendant is about to come by and scream at me for not closing my laptop. See you next month!&lt;/p&gt;</content><category term="blog"/></entry><entry><title>An opinionated running dashboard</title><link href="https://duarteocarmo.com/blog/an-opinionated-running-dashboard.html" rel="alternate"/><published>2026-03-25T00:00:00+01:00</published><updated>2026-03-25T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2026-03-25:/blog/an-opinionated-running-dashboard.html</id><summary type="html">&lt;p&gt;&lt;center&gt;
&lt;a href="https://tandarunner.duarteocarmo.com/"&gt;
&lt;img src="https://duarteocarmo.com/images/99/calendar.webp" alt="Tanda Runner calendar view" style="max-width:100%;border-radius: 2px"&gt;
&lt;/a&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;As you get older, life becomes complicated. Not in a bad way. There’s just more going on. We don’t all want to run marathons. Maybe you want to run a &lt;a href="https://www.parkrun.com/"&gt;parkrun&lt;/a&gt;. Maybe you want to gradually increase your volume. Maybe you don't want to run at all. Whatever …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;center&gt;
&lt;a href="https://tandarunner.duarteocarmo.com/"&gt;
&lt;img src="https://duarteocarmo.com/images/99/calendar.webp" alt="Tanda Runner calendar view" style="max-width:100%;border-radius: 2px"&gt;
&lt;/a&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;As you get older, life becomes complicated. Not in a bad way. There’s just more going on. We don’t all want to run marathons. Maybe you want to run a &lt;a href="https://www.parkrun.com/"&gt;parkrun&lt;/a&gt;. Maybe you want to gradually increase your volume. Maybe you don't want to run at all. Whatever your running goal is, you &lt;em&gt;should&lt;/em&gt; be able to plan for it.&lt;/p&gt;
&lt;p&gt;In the new version of &lt;a href="https://tandarunner.duarteocarmo.com/"&gt;Tanda Runner&lt;/a&gt; you can plan for &lt;em&gt;whatever&lt;/em&gt; your running goal is. I redesigned all the parts I wasn't a fan of. The result is a redesigned chat interface with an agent that knows about all your runs, and, my favourite: a fully automated - but personalized - running plan generator.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;a href="https://tandarunner.duarteocarmo.com/"&gt;
&lt;img src="https://duarteocarmo.com/images/99/chat_interface.webp" alt="Tanda Runner chat interface" style="max-width:100%;border-radius: 2px"&gt;
&lt;/a&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;The plan generator analyzes your past activity, understands your running habits, and builds a completely customized running plan just for you. You can take this and export it to your calendar app of choice.&lt;/p&gt;
&lt;p&gt;The technology choices evolved a bit, but not too much. I’m still on my beloved &lt;a href="https://www.djangoproject.com/"&gt;Django&lt;/a&gt; + &lt;a href="https://htmx.org/"&gt;htmx&lt;/a&gt; stack. The LLM models are served through &lt;a href="https://openrouter.ai/"&gt;OpenRouter&lt;/a&gt;, and tracing is sent automatically to &lt;a href="https://openrouter.ai/docs/guides/features/broadcast/weave"&gt;W&amp;amp;B Weave&lt;/a&gt;. The agent itself is built with &lt;a href="https://ai.pydantic.dev/"&gt;PydanticAI&lt;/a&gt;, a framework I’ve enjoyed using more and more. &lt;/p&gt;
&lt;p&gt;There aren’t enough niche running tools out there. This one is mine.&lt;/p&gt;
&lt;p&gt;See you out there, runner.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Portuguese variety identification: The bitter lesson</title><link href="https://duarteocarmo.com/blog/portuguese-variety-identification-the-bitter-lesson.html" rel="alternate"/><published>2026-03-09T00:00:00+01:00</published><updated>2026-03-09T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2026-03-09:/blog/portuguese-variety-identification-the-bitter-lesson.html</id><summary type="html">&lt;div class="iframe-container"&gt;
  &lt;iframe scrolling="no" id="results-frame" src="https://duarteocarmo.com/html/98/euptvid-results.html" title="FastText vs BERT results" loading="lazy"&gt;
  &lt;/iframe&gt;
&lt;/div&gt;

&lt;p&gt;Given some text in Portuguese, how easy is it to determine if it's from Brazil or Portugal&lt;sup id="sf-portuguese-variety-identification-the-bitter-lesson-1-back"&gt;&lt;a href="#sf-portuguese-variety-identification-the-bitter-lesson-1" class="simple-footnote" title="Or somewhere else!"&gt;1&lt;/a&gt;&lt;/sup&gt;? For native speakers, this is pretty easy – it's almost a feeling. But for machines: not so much. &lt;/p&gt;
&lt;p&gt;This might seem like a useless problem at first. But in the age of language …&lt;/p&gt;</summary><content type="html">&lt;div class="iframe-container"&gt;
  &lt;iframe scrolling="no" id="results-frame" src="https://duarteocarmo.com/html/98/euptvid-results.html" title="FastText vs BERT results" loading="lazy"&gt;
  &lt;/iframe&gt;
&lt;/div&gt;

&lt;p&gt;Given some text in Portuguese, how easy is it to determine if it's from Brazil or Portugal&lt;sup id="sf-portuguese-variety-identification-the-bitter-lesson-1-back"&gt;&lt;a href="#sf-portuguese-variety-identification-the-bitter-lesson-1" class="simple-footnote" title="Or somewhere else!"&gt;1&lt;/a&gt;&lt;/sup&gt;? For native speakers, this is pretty easy – it's almost a feeling. But for machines: not so much. &lt;/p&gt;
&lt;p&gt;This might seem like a useless problem at first. But in the age of language models and some of my research, I've seen this problem more than once.&lt;/p&gt;
&lt;p&gt;When I &lt;a href="/blog/a-benchmark-for-language-models-on-european-portuguese.html"&gt;contributed Portuguese datasets to EuroEval&lt;/a&gt;, I had a hard time finding exclusively European Portuguese data. Most of the Portuguese on the Internet is &lt;em&gt;obviously&lt;/em&gt; of the Brazilian variant. When I created &lt;a href="/blog/bagaco-a-pretraining-dataset-for-european-portuguese.html"&gt;Bagaço&lt;/a&gt;, a pretraining dataset for European Portuguese, the most obvious solution was to filter data for &lt;code&gt;.pt&lt;/code&gt; domains. This works as a proxy, but is far from ideal.&lt;/p&gt;
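&lt;p&gt;A minimal sketch of that domain proxy, assuming each document carries its source URL. The helper name &lt;code&gt;is_pt_domain&lt;/code&gt; and the example URLs are made up for illustration; only the hostname is checked, so a &lt;code&gt;/pt/&lt;/code&gt; path on a &lt;code&gt;.com&lt;/code&gt; site does not count.&lt;/p&gt;

```python
from urllib.parse import urlsplit


def is_pt_domain(url, suffix=".pt"):
    """Return True when the URL's hostname ends with the given suffix."""
    # hostname is None for malformed URLs, so fall back to an empty string.
    hostname = urlsplit(url).hostname or ""
    return hostname.lower().endswith(suffix)


# Hypothetical URLs, only to show which ones the proxy keeps.
urls = [
    "https://example.pt/noticias",
    "https://blog.example.com/pt/artigo",
    "https://example.com.br/noticias",
]
kept = [url for url in urls if is_pt_domain(url)]
print(kept)
```

&lt;p&gt;This is cheap to run, but it drops any European Portuguese text hosted outside &lt;code&gt;.pt&lt;/code&gt; and keeps anything hosted under it, whatever variety it is written in.&lt;/p&gt;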
&lt;p&gt;It stuck in my mind: If we want to build good European Portuguese models we need robust ways of detecting that language variant in the wild.&lt;/p&gt;
&lt;p&gt;In &lt;em&gt;&lt;a href="https://arxiv.org/abs/2502.14394"&gt;Enhancing Portuguese Variety Identification with Cross-Domain Approaches&lt;/a&gt;&lt;/em&gt; the authors take a stab at the problem. They fine-tune &lt;a href="https://huggingface.co/neuralmind/bert-large-portuguese-cased"&gt;BERTimbau&lt;/a&gt; to classify text into PT-PT or PT-BR. The key idea: by hiding specific types of words (places, names, etc.) the model is forced to learn the structural/grammatical differences between the languages. It works well, but there's a problem: It doesn't scale!&lt;/p&gt;
&lt;p&gt;Running a BERT-based model to filter pre-training data (at internet scale) could take &lt;em&gt;days&lt;/em&gt; even with dedicated hardware. I needed something as good as their model, but that I could run fast on hundreds of gigabytes of text. Most pre-training datasets use something like &lt;a href="https://fasttext.cc/"&gt;FastText&lt;/a&gt; or &lt;a href="https://github.com/cisnlp/GlotLID"&gt;variants&lt;/a&gt; to filter data by language. It scales really well&lt;sup id="sf-portuguese-variety-identification-the-bitter-lesson-2-back"&gt;&lt;a href="#sf-portuguese-variety-identification-the-bitter-lesson-2" class="simple-footnote" title="So well that I forced Claude into building a Rust implementation for it."&gt;2&lt;/a&gt;&lt;/sup&gt; and doesn't need any type of dedicated hardware.&lt;/p&gt;
&lt;p&gt;After some &lt;a href="https://github.com/LIAAD/portuguese_vid/issues/3"&gt;hiccups&lt;/a&gt; (and a lot of tokens) I finally managed to reproduce the results from the paper. Once that was done, I started training a series of FastText models on &lt;a href="https://huggingface.co/datasets/liaad/PtBrVId"&gt;their same dataset&lt;/a&gt;. The results were &lt;em&gt;mediocre&lt;/em&gt; at best. Different configurations of the data, some oversampling, parameter tuning, nothing got close to the baseline performance (theirs).&lt;/p&gt;
&lt;p&gt;Then I thought: what if I just need more data? &lt;/p&gt;
&lt;p&gt;After some digging, I found &lt;a href="https://huggingface.co/datasets/bastao/VeraCruz_PT-BR"&gt;this dataset&lt;/a&gt; from &lt;a href="https://www.linkedin.com/in/fab-bastos/"&gt;Fabio Bastos&lt;/a&gt; - he basically did the exact same domain-level filtering as &lt;a href="https://huggingface.co/datasets/duarteocarmo/fineweb2-bagaco"&gt;Bagaço&lt;/a&gt;, but for a much larger dataset: &lt;a href="https://huggingface.co/datasets/uonlp/CulturaX"&gt;CulturaX&lt;/a&gt;. It is separated into Portuguese and Brazilian sources: exactly what I needed to train a model. I started training classifiers, and the more data I trained on, the better performance I got. Once I trained on ~6M rows, lo and behold: similar performance to the baseline but &lt;strong&gt;10x&lt;/strong&gt; faster. &lt;/p&gt;
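&lt;p&gt;For readers who have not trained a FastText classifier before: its supervised mode reads a plain text file with one example per line, where each label is prefixed with &lt;code&gt;__label__&lt;/code&gt;. The sketch below only prepares such lines; the label names &lt;code&gt;pt-pt&lt;/code&gt; and &lt;code&gt;pt-br&lt;/code&gt;, the helper name, and the two sentences are assumptions for illustration, not the preprocessing actually used here.&lt;/p&gt;

```python
def to_fasttext_line(text, label):
    """Format one example as '__label__X some text' on a single line."""
    # Collapse newlines and repeated whitespace: one example must fit on one line.
    single_line = " ".join(text.split())
    return f"__label__{label} {single_line}"


# Hypothetical rows; a real run would stream millions of (text, label) pairs.
rows = [
    ("O autocarro chegou atrasado ao Porto.", "pt-pt"),
    ("O ônibus chegou atrasado\nem São Paulo.", "pt-br"),
]
lines = [to_fasttext_line(text, label) for text, label in rows]
print("\n".join(lines))
```

&lt;p&gt;Written to a file, lines like these are what &lt;code&gt;fasttext.train_supervised&lt;/code&gt; takes as its &lt;code&gt;input&lt;/code&gt;.&lt;/p&gt;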
&lt;p&gt;Actually, the quantized version of the model I trained packs a whole lot of punch in only 70MB! It is so small it runs in your browser:&lt;/p&gt;
&lt;div class="iframe-container"&gt;
  &lt;iframe scrolling="no" id="demo-frame" src="https://duarteocarmo.com/html/98/demo.html" title="FastText Portuguese variety classifier demo" loading="lazy"&gt;
  &lt;/iframe&gt;
&lt;/div&gt;

&lt;p&gt;As a sanity check, I ran the model on a subset of &lt;a href="https://huggingface.co/datasets/HuggingFaceFW/fineweb-2"&gt;FineWeb2&lt;/a&gt; that had been filtered for Portuguese. Here's a snippet of the output:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Loading model...
Downloading parquet...Loaded 33,846 rows
Classifying with threshold=0.7...
Classified 33,846 rows in 30.1s (1,125 rows/sec)
Score distribution:
mean=0.320  median=0.176
Total: 33,846 → PT-PT: 6,102 (18.0% kept)
Random PT-PT URLs:
- https://blogtailors.com/4997524.html
- https://www.bodyboardcenter.com/pt/apparel/734-sen-no-sen-not-diet-atoll-polo
- https://abapinho.com/2017/09/plantuml-finalmente-o-uml-da-para-usar/
- https://www.sabado.pt/portugal/detalhe/como-jose-veiga-denunciou-rui-rangel
- http://www.poadvogados.pt/Areas/InsolvRecoverE/?MOBILE=1
- https://www.viralagenda.com/pt/events/926945/palestra-intervir-com-criancas-e-jovens-em-risco-como-actuar
- http://confessionsfashiongirl.blogspot.com/2009/11/coisinhas-sem-as-quais-nao-podia-viver_08.html
- https://www.powrenism.com/forum/welcome-to-the-forum/exercice-dorsaux-sans-materiel-myogen-dianabol
- https://topbinamvf.web.app/berezny38567def/forex-seminbrio-manchester-1780.html
- https://foreveryoung.sapo.pt/rustic-chicken-ja-conhece-a-nova-proposta-da-mcdonalds/
- https://aralumiar.wordpress.com/2007/01/
- http://iclub.pt/video-do-4o-iclub-dinner/
- http://www.hotfrog.pt/empresa/lisboa/odivelas/avarias-ao-domicilio
- https://www.rowenta.pt/eficiencia-energetica-by-rowenta
- https://codimagem.com/2023/05/12/giros-gr%C3%A1tis-eletr%C3%B4nicos/
- http://apracas.pt/article/list/1/noticias/
- https://www.noticiasaominuto.com/politica/864278/cgtp-diz-que-greve-mostrou-que-autoeuropa-nao-vive-acima-da-lei
- http://nave-azul.blogspot.com/2010/11/portal-101110.html
- http://gbtrabalhoshubj.ischadia.info/mensagem-de-fernando-pessoa-terceira-parte-o-encoberto.html
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;As you can see, lots of these would never pop up if we just filtered for &lt;code&gt;.pt&lt;/code&gt; domains. Also, the percentage kept looks reasonable: ~18% of the text in the &lt;code&gt;pt&lt;/code&gt; split is classified as European Portuguese.&lt;/p&gt;
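&lt;p&gt;The filtering step in that log boils down to a threshold on a per-document score. A small sketch, assuming the score is the classifier's confidence that a document is PT-PT and that rows at or above the threshold are kept; the function name and the four toy rows are made up, while the two counts come from the log above.&lt;/p&gt;

```python
def keep_pt_pt(scored_rows, threshold=0.7):
    """Keep rows whose PT-PT score is at or above the threshold."""
    return [row for row in scored_rows if row["score"] >= threshold]


# Made-up scores, only to show the mechanics of the cut-off.
scored_rows = [
    {"url": "https://example.pt/a", "score": 0.93},
    {"url": "https://example.com/b", "score": 0.71},
    {"url": "https://example.com.br/c", "score": 0.05},
    {"url": "https://example.org/d", "score": 0.18},
]
kept = keep_pt_pt(scored_rows)

# Share reported in the log: 6,102 kept out of 33,846 rows.
share_kept = 6_102 / 33_846
print(f"toy rows kept: {len(kept)} of {len(scored_rows)}")
print(f"share kept in the log: {share_kept:.1%}")
```

&lt;p&gt;Raising the threshold trades recall for precision: fewer documents survive, but the ones that do are more likely to be European Portuguese.&lt;/p&gt;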
&lt;p&gt;There's a parallel to LLMs in this post. Unlike the original paper, we didn't do any fancy delexicalization, we didn't even curate a pristine mix of data from different sources. We threw 6M rows of web data at the problem, and the problem got fixed. This is a known lesson in the world of computing: &lt;a href="https://www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson.pdf"&gt;it's called the Bitter Lesson&lt;/a&gt;. &lt;/p&gt;
&lt;p&gt;The model and code are available on &lt;a href="https://github.com/duarteocarmo/eupt_vid"&gt;GitHub&lt;/a&gt; and &lt;a href="https://huggingface.co/duarteocarmo/fasttext-euptvid"&gt;Hugging Face&lt;/a&gt;.&lt;/p&gt;
&lt;style&gt;
  .iframe-container {
    width: 100%;
    max-width: 800px;
    margin: 2rem auto;
    text-align: center;
  }

  #results-frame {
    width: 100%;
    height: 440px;
    border: none;
    display: block;
    box-sizing: border-box;
  }

  #demo-frame {
    width: 100%;
    height: 420px;
    border: none;
    display: block;
    box-sizing: border-box;
  }

  @media (max-width: 480px) {
    #results-frame { height: 300px; }
    #demo-frame { height: 380px; }
  }
&lt;/style&gt;&lt;ol class="simple-footnotes"&gt;&lt;li id="sf-portuguese-variety-identification-the-bitter-lesson-1"&gt;Or &lt;a href="https://en.wikipedia.org/wiki/Portuguese-speaking_world"&gt;somewhere else&lt;/a&gt;! &lt;a href="#sf-portuguese-variety-identification-the-bitter-lesson-1-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;li id="sf-portuguese-variety-identification-the-bitter-lesson-2"&gt;So well that I forced Claude into building a &lt;a href="https://github.com/duarteocarmo/fasttext.rs"&gt;Rust implementation&lt;/a&gt; for it. &lt;a href="#sf-portuguese-variety-identification-the-bitter-lesson-2-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;</content><category term="blog"/></entry><entry><title>Retrospectiva #5</title><link href="https://duarteocarmo.com/blog/retrospectiva-5.html" rel="alternate"/><published>2026-03-01T00:00:00+01:00</published><updated>2026-03-01T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2026-03-01:/blog/retrospectiva-5.html</id><summary type="html">&lt;p&gt;Happy February! Just like that, we are close to wrapping up the first quarter of 2026. After quite some time roaming around, we finally flew back home to Copenhagen. It's cold, windy, and grey, but it's also calm, organized, and cozy. Most of all: it's &lt;em&gt;home&lt;/em&gt;. &lt;/p&gt;
&lt;p&gt;Another good thing about …&lt;/p&gt;</summary><content type="html">&lt;p&gt;Happy February! Just like that, we are close to wrapping up the first quarter of 2026. After quite some time roaming around, we finally flew back home to Copenhagen. It's cold, windy, and grey, but it's also calm, organized, and cozy. Most of all: it's &lt;em&gt;home&lt;/em&gt;. &lt;/p&gt;
&lt;p&gt;Another good thing about Denmark is that it's one of the &lt;a href="https://www.instagram.com/p/C6oRHlVs4lK/"&gt;flattest&lt;/a&gt; countries in Europe. Perfect for increasing weekly running volume. It's cold, but it's almost time to ditch the running tights! Just a couple months left.&lt;/p&gt;
&lt;h2 id="using"&gt;Using&lt;/h2&gt;
&lt;p&gt;We've been going with Allegra pretty much everywhere. From small walks, to airplanes, to high speed trains. We've been lucky - she's really chill, and sleeps most of the time. But one of our best purchases was the &lt;a href="https://ergobaby.com/baby-carrier/omni/the-omni-360-baby-carrier#color-Midnight%20Blue%20"&gt;Ergo Baby Omni 360&lt;/a&gt;. Vitto managed to find it second-hand at a nice price. But it's been absolutely great to take her everywhere. And it should last us many more months. A strong recommendation for parents out there. &lt;/p&gt;
&lt;div style="display: flex; justify-content: center;"&gt;
&lt;blockquote class="twitter-tweet" data-height="50"&gt;&lt;p lang="en" dir="ltr"&gt;Today is the day I canceled copilot.&lt;a href="https://t.co/OPUIEX5eIe"&gt;https://t.co/OPUIEX5eIe&lt;/a&gt;&lt;/p&gt;&amp;mdash; Duarte (@duarteocarmo) &lt;a href="https://twitter.com/duarteocarmo/status/2027424182846492927?ref_src=twsrc%5Etfw"&gt;February 27, 2026&lt;/a&gt;&lt;/blockquote&gt; &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8"&gt;&lt;/script&gt;
&lt;/div&gt;

&lt;p&gt;As for tech, two interesting things have been changing in my workflow (which is still &lt;em&gt;unstable&lt;/em&gt;). The first is my increasing usage of coding agents for everyday things. For things that I find myself repeating more and more, I've been using &lt;a href="https://agentskills.io/home"&gt;agent skills&lt;/a&gt;. I still bounce around different agent CLIs quite a bit, but I can largely reuse the skills across tools. For the past month, I've been using a mix of the &lt;a href="https://developers.openai.com/codex/app/"&gt;Codex app&lt;/a&gt; and the &lt;em&gt;shitty-but-not-so-shitty&lt;/em&gt; coding agent &lt;a href="https://shittycodingagent.ai/"&gt;Pi&lt;/a&gt;. The second interesting thing is the increasing capability of local LLMs. For example, I completely replaced my &lt;a href="https://github.com/features/copilot"&gt;GitHub Copilot&lt;/a&gt; subscription with my own fork of &lt;a href="https://github.com/ggml-org/llama.vim"&gt;llama.vim&lt;/a&gt;, called &lt;a href="https://github.com/duarteocarmo/llama.lua"&gt;llama.lua&lt;/a&gt;. &lt;/p&gt;
&lt;h2 id="reading"&gt;Reading&lt;/h2&gt;
&lt;p&gt;Two long read recommendations for this month. The first is &lt;em&gt;&lt;a href="https://elanapearl.github.io/blog/2024/the-illustrated-alphafold/"&gt;The Illustrated AlphaFold&lt;/a&gt;&lt;/em&gt; by Elana P. Simon. She does a great illustrated rundown of the key inner workings of the famous algorithm from Google. Another recommendation is the latest article from Raschka: &lt;em&gt;&lt;a href="https://sebastianraschka.com/blog/2026/a-dream-of-spring-for-open-weight.html"&gt;A Dream of Spring for Open-Weight LLMs: 10 Architectures from Jan-Feb 2026&lt;/a&gt;&lt;/em&gt;. It's funny how LLM architectures have changed - but really, not all that much. &lt;/p&gt;
&lt;p&gt;&lt;img alt="Alphafold" src="https://duarteocarmo.com/images/97/full_arch_for_labeling.png" /&gt;&lt;/p&gt;
&lt;p&gt;I also recommend the latest &lt;a href="https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks"&gt;bomb from Anthropic&lt;/a&gt; accusing almost every Chinese lab of calling their API to distil their own models. Isn't it funny how large labs train their models on the entire Internet, but are then very much protective when someone starts doing the same with their API? In any case - didn't Anthropic do the &lt;a href="https://www.nytimes.com/2025/09/05/technology/anthropic-settlement-copyright-ai.html"&gt;exact same thing, but with book authors&lt;/a&gt;? &lt;/p&gt;
&lt;p&gt;&lt;img alt="Cartoon" src="https://duarteocarmo.com/images/97/cartoon.jpeg" /&gt;&lt;/p&gt;
&lt;p&gt;Pretty much finished with &lt;em&gt;&lt;a href="https://www.amazon.com/LLMOps-Managing-Language-Models-Production/dp/1098154207"&gt;LLMOps&lt;/a&gt;&lt;/em&gt; from Abi Aryan. Interesting read to consolidate some of the ideas from everything I've been seeing in the field. This month I've picked up &lt;em&gt;&lt;a href="https://gregmckeown.com/books/essentialism/"&gt;Essentialism&lt;/a&gt;&lt;/em&gt; from Greg McKeown, and &lt;em&gt;&lt;a href="https://www.amazon.com/Art-Doing-Science-Engineering-Learning/dp/1732265178"&gt;The Art of Doing Science and Engineering&lt;/a&gt;&lt;/em&gt; from Richard Hamming. Not sure which one will stick yet. &lt;/p&gt;
&lt;h2 id="listening"&gt;Listening&lt;/h2&gt;
&lt;p&gt;I have three podcast recommendations for this month. Unsurprisingly - they are all very much about LLMs, AI, and the future of technology. But I'm sure you were already expecting that. &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://overcast.fm/+AAyIRRePJOw"&gt;The Evolution of Reasoning in Small Language Models with Yejin Choi&lt;/a&gt; (TWIML AI Podcast): I'm a big believer in the increasing power of small language models.  &lt;/li&gt;
&lt;li&gt;&lt;a href="https://overcast.fm/+ABKyPKQ2rGU"&gt;The third golden age of software engineering&lt;/a&gt; (The Pragmatic Engineer): How the world of Software is changing, and what to focus on. &lt;/li&gt;
&lt;li&gt;&lt;a href="https://overcast.fm/+AA5AWNElGto"&gt;Jared Sleeper on Which Software Companies Will Survive the “SaaSpocalypse”&lt;/a&gt; (Odd Lots): Software stocks have been dipping, why? &lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="watching"&gt;Watching&lt;/h2&gt;
&lt;p&gt;Every year, I try to completely avoid any sort of news related to Formula 1. I don't watch races during the year, I block out everything I possibly can about it. Why? Because I'm always waiting for the next season of Drive to Survive. You should too. Even if you're not a fan of cars (I've never been). But I'm a fan of competitive sports, and this show captures that essence flawlessly. &lt;/p&gt;
&lt;iframe width="100%" height="315" src="https://www.youtube.com/embed/T3nrpj38zSk?si=VEvyQCd3KFbZBqhV" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen&gt;&lt;/iframe&gt;

&lt;p&gt;That's it for the February edition of Retrospectiva, see you next month!&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Bagaço: A pretraining dataset for European Portuguese</title><link href="https://duarteocarmo.com/blog/bagaco-a-pretraining-dataset-for-european-portuguese.html" rel="alternate"/><published>2026-02-23T00:00:00+01:00</published><updated>2026-02-23T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2026-02-23:/blog/bagaco-a-pretraining-dataset-for-european-portuguese.html</id><summary type="html">&lt;p&gt;Let's say your goal is to train a Large Language Model only on European Portuguese. Where do you start? What datasets are out there? What websites are being scraped for the large black box? &lt;a href="https://huggingface.co/datasets/duarteocarmo/fineweb2-bagaco"&gt;Bagaço&lt;/a&gt; - named after the popular Portuguese moonshine - is a small step in that direction. &lt;/p&gt;
&lt;p&gt;&lt;img alt="Domain scatter" src="https://duarteocarmo.com/images/96/domain_scatter.webp" /&gt;&lt;/p&gt;
&lt;p&gt;In June …&lt;/p&gt;</summary><content type="html">&lt;p&gt;Let's say your goal is to train a Large Language Model only on European Portuguese. Where do you start? What datasets are out there? What websites are being scraped for the large black box? &lt;a href="https://huggingface.co/datasets/duarteocarmo/fineweb2-bagaco"&gt;Bagaço&lt;/a&gt; - named after the popular Portuguese moonshine - is a small step in that direction. &lt;/p&gt;
&lt;p&gt;&lt;img alt="Domain scatter" src="https://duarteocarmo.com/images/96/domain_scatter.webp" /&gt;&lt;/p&gt;
&lt;p&gt;In June '25, the Hugging Face team released &lt;a href="https://huggingface.co/datasets/HuggingFaceFW/fineweb-2#%F0%9F%A5%82-fineweb2"&gt;FineWeb2&lt;/a&gt;. FineWeb2 succeeded &lt;a href="https://huggingface.co/datasets/HuggingFaceFW/fineweb"&gt;FineWeb&lt;/a&gt;, a massive cleaned up dataset of the entire internet. Think millions of web pages, specifically gathered to train Large Language Models. FineWeb2 expanded FineWeb, by adding documents in hundreds of other languages. 199 million of those are in Portuguese, but how many of them are really from Portugal, or written in European Portuguese? That's how &lt;a href="https://huggingface.co/datasets/duarteocarmo/fineweb2-bagaco"&gt;Bagaço&lt;/a&gt; was born. &lt;/p&gt;
&lt;p&gt;&lt;br&gt;&lt;/p&gt;
&lt;p&gt;&lt;iframe
  src="https://huggingface.co/datasets/duarteocarmo/fineweb2-bagaco/embed/viewer/sample/train"
  frameborder=""
  width="100%"
  height="460px"
&gt;&lt;/iframe&gt;&lt;/p&gt;
&lt;p&gt;&lt;br&gt;&lt;/p&gt;
&lt;p&gt;Bagaço keeps only the 16 million documents that match domains from Portugal (e.g., &lt;code&gt;.pt&lt;/code&gt;). But it doesn't stop there. The goal of Bagaço is also to tell us a bit more about the data. I built two classifiers. The first assigns every document to one of nine categories: Society, Arts, Business, Science, Sports, Lifestyle, Health, Games, and News. The second assigns an &lt;em&gt;educational score&lt;/em&gt; to each document (from 0 to 5), which gives an educational "value" to each web page. &lt;a href="https://huggingface.co/HuggingFaceFW/fineweb-edu-classifier"&gt;Very inspired by the FineWeb-Edu work from the HF team&lt;/a&gt;. Both were trained in a similar fashion: LLMs (Gemini and Qwen) annotated a large sample, and a balanced Logistic Regression was then fit on &lt;code&gt;e5&lt;/code&gt; embeddings (more info on those &lt;a href="https://huggingface.co/datasets/duarteocarmo/fineweb2-bagaco#document-classification"&gt;here&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;&lt;img alt="Volume and score" src="https://duarteocarmo.com/images/96/volume_educational_score.webp" /&gt;&lt;/p&gt;
&lt;p&gt;It's an interesting subset to analyze. The two most popular domains are &lt;code&gt;desporto.sapo.pt&lt;/code&gt; and &lt;code&gt;record.pt&lt;/code&gt;: sports news websites (I don't think I expected anything different from Portugal). The dataset is nicely distributed across categories, but something's a bit rotten. The average educational score of the dataset is only 0.85 out of 5 (!). &lt;/p&gt;
&lt;p&gt;With FineWeb-Edu, the Hugging Face team showed that if they &lt;a href="https://arxiv.org/abs/2406.17557"&gt;filtered a dataset for educational scores above 3.0&lt;/a&gt;, the downstream LLMs had dramatically better performance. Well, if we filter Bagaço by the same criterion, we wouldn't have much left (around 1M documents, that's &lt;em&gt;it&lt;/em&gt;). If you're interested in more stats and facts, here's a &lt;a href="https://static.marimo.app/static/feafeafe-tfm7"&gt;Marimo notebook&lt;/a&gt; with an analysis. (&lt;a href="https://huggingface.co/datasets/duarteocarmo/fineweb2-bagaco/tree/main/scripts"&gt;Here are several scripts&lt;/a&gt; to create the dataset, train/run the classifiers, or turn Bagaço into an easily queryable &lt;a href="https://duckdb.org"&gt;DuckDB&lt;/a&gt; database.)&lt;/p&gt;
&lt;p&gt;&lt;img alt="Volume and score" src="https://duarteocarmo.com/images/96/category_score_distribution.webp" /&gt;&lt;/p&gt;
&lt;p&gt;What if we just ignored the fact that the educational score for Bagaço is pretty low and pretrained an LLM on it anyway? (For the record, I did, and it wasn't good.) Bagaço has roughly 7 billion words, or about 9 billion tokens of text. According to &lt;a href="https://arxiv.org/abs/2203.15556"&gt;Chinchilla&lt;/a&gt; (~20 tokens per parameter), that is enough to train an LLM of ~450 million parameters. Not an Opus competitor by any means, but it could still be an interesting model. &lt;/p&gt;
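&lt;p&gt;A back-of-the-envelope sketch of that arithmetic, assuming the ~20 tokens per parameter rule of thumb and the rough ~9 billion token count quoted above. The function name is made up for illustration, and this is only a rough estimate, not a training recipe:&lt;/p&gt;

```python
# Rule of thumb associated with the Chinchilla paper: ~20 training tokens per parameter.
TOKENS_PER_PARAMETER = 20


def compute_optimal_parameters(num_tokens: float, tokens_per_parameter: float = TOKENS_PER_PARAMETER) -> float:
    """Approximate the compute-optimal model size for a given token budget."""
    return num_tokens / tokens_per_parameter


# ~9 billion tokens is the rough figure for Bagaço, not an exact count.
bagaco_tokens = 9e9
params = compute_optimal_parameters(bagaco_tokens)
print(f"{params / 1e6:.0f}M parameters")  # prints "450M parameters"
```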
&lt;p&gt;You see, there's still much to do in this space. Most high quality data ablations have been focused on the English language. And sure – there's a lot of English out there – but there's surely a lot of European Portuguese data out there too! We just have to find it! We probably can't get away with just filtering a dataset by &lt;code&gt;.pt&lt;/code&gt; domains. We might need something more sophisticated. But data is not the only bottleneck: we also need better &lt;a href="https://duarteocarmo.com/blog/a-benchmark-for-language-models-on-european-portuguese"&gt;evaluations&lt;/a&gt; for European Portuguese. &lt;/p&gt;
&lt;p&gt;Bagaço is only the first try. The potential is huge! Time to get to work. &lt;/p&gt;</content><category term="blog"/></entry><entry><title>Retrospectiva #4</title><link href="https://duarteocarmo.com/blog/retrospectiva-4.html" rel="alternate"/><published>2026-01-28T00:00:00+01:00</published><updated>2026-01-28T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2026-01-28:/blog/retrospectiva-4.html</id><summary type="html">&lt;p&gt;&lt;img alt="Adriatic coast" src="https://duarteocarmo.com/images/95/cover.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;Well—that was quick. Just like that, the first month of the year is gone. Winter blues would normally peak around this time. But not this year. We're spending time with family in my favorite place on Earth, somewhere along the Adriatic coast of Italy, in the &lt;a href="https://en.wikipedia.org/wiki/Marche"&gt;Marche&lt;/a&gt; region.&lt;/p&gt;
&lt;p&gt;It's …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;img alt="Adriatic coast" src="https://duarteocarmo.com/images/95/cover.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;Well—that was quick. Just like that, the first month of the year is gone. Winter blues would normally peak around this time. But not this year. We're spending time with family in my favorite place on Earth, somewhere along the Adriatic coast of Italy, in the &lt;a href="https://en.wikipedia.org/wiki/Marche"&gt;Marche&lt;/a&gt; region.&lt;/p&gt;
&lt;p&gt;It's hard to give in to winter blues when you get to run every day around the beautiful hills of the &lt;a href="https://www.parcodelconero.org/"&gt;Conero Regional Park&lt;/a&gt;. If you read this, don't tell anyone please: let them visit the usual suspects instead.&lt;/p&gt;
&lt;p&gt;Here, life moves slow. You &lt;em&gt;enjoy&lt;/em&gt; things. You &lt;em&gt;savor&lt;/em&gt; things. You talk loud. You socialize. You live for 100 years. It's Italy.&lt;/p&gt;
&lt;h2 id="using"&gt;Using&lt;/h2&gt;
&lt;p&gt;I would be lying if I told you I haven't been using a &lt;a href="https://duarteocarmo.com/blog/how-to-police-your-agents"&gt;bunch of agents&lt;/a&gt;. I've been using LLM-based tools for the better part of 2025. But something clicked towards the end, and I haven't looked back since. As in my post on &lt;a href="https://duarteocarmo.com/blog/how-to-police-your-agents"&gt;policing agents&lt;/a&gt;, my fear is still the same: that &lt;em&gt;quantity&lt;/em&gt; and &lt;em&gt;quality&lt;/em&gt; are inversely proportional. All I can do is find ways of keeping quality high. It's nice to fire them off to do work I'm not interested in doing. Like redesigning this website for the Nth time. Or correcting typos in all my blog posts since the beginning of time.&lt;/p&gt;
&lt;p&gt;On another note. We always travel light, &lt;em&gt;even&lt;/em&gt; with a newborn. This means I have to pack just enough things - especially running gear. The &lt;a href="https://www.newbalance.com/pd/fuelcell-rebel-v5/MFCXV5-50690.html#dwvar_MFCXV5-50690_style=MFCXLV5&amp;amp;dwvar_MFCXV5-50690_width=D&amp;amp;pid=MFCXV5-50690&amp;amp;quantity=1"&gt;FuelCell Rebel v5&lt;/a&gt; have been a pleasant surprise. I'm not loyal to brands but this model won me over. They slip where it's muddy though, so beware.&lt;/p&gt;
&lt;h2 id="reading"&gt;Reading&lt;/h2&gt;
&lt;p&gt;I read more short form than long form this month. I'm still going through the &lt;a href="https://huggingface.co/spaces/HuggingFaceTB/smol-training-playbook"&gt;fabulous long form writing&lt;/a&gt; from the Hugging Face team. Lots of small tips, tricks, and scars from training Large Language Models. Elon is a divisive character, but one thing I do appreciate is the opportunity to read through the &lt;a href="https://github.com/xai-org/x-algorithm"&gt;recommendation algorithm&lt;/a&gt; that powers X.com in my free time. Finally, the GPTZero folks have written a great piece, "&lt;a href="https://gptzero.me/news/neurips/"&gt;GPTZero finds 100 new hallucinations in NeurIPS 2025 accepted papers&lt;/a&gt;", about hallucinated references in NeurIPS papers. The world is getting filled with AI-generated content.&lt;/p&gt;
&lt;p&gt;&lt;img alt="FineWeb chart" src="https://duarteocarmo.com/images/95/fineweb-chart.png" /&gt;&lt;/p&gt;
&lt;figcaption&gt;From &lt;a href="https://huggingface.co/spaces/HuggingFaceFW/blogpost-fineweb-v1"&gt;🍷 FineWeb: decanting the web for the finest text data at scale&lt;/a&gt;&lt;/figcaption&gt;

&lt;p&gt;I'm still finishing up both books I talked about in the last issue: &lt;a href="https://en.wikipedia.org/wiki/Old_Man%27s_War"&gt;Old Man's War&lt;/a&gt; from John Scalzi and &lt;a href="https://www.oreilly.com/library/view/llmops/9781098154196/"&gt;LLMOps&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="listening"&gt;Listening&lt;/h2&gt;
&lt;p&gt;I follow a lot of DJs on SoundCloud. One that has consistently released good mixes is &lt;a href="https://www.instagram.com/sashamvrie/"&gt;Sasha Marie&lt;/a&gt;. It's the best soundtrack for doing deep work. Whatever it is you're doing, you'll do it better when this is in the background:&lt;/p&gt;
&lt;iframe width="100%" height="300" scrolling="no" frameborder="no" allow="autoplay" src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/soundcloud%253Atracks%253A2251134248&amp;color=%23ff5500&amp;auto_play=false&amp;hide_related=false&amp;show_comments=true&amp;show_user=true&amp;show_reposts=false&amp;show_teaser=true&amp;visual=true"&gt;&lt;/iframe&gt;
&lt;div style="font-size: 10px; color: #cccccc;line-break: anywhere;word-break: normal;overflow: hidden;white-space: nowrap;text-overflow: ellipsis; font-family: Interstate,Lucida Grande,Lucida Sans Unicode,Lucida Sans,Garuda,Verdana,Tahoma,sans-serif;font-weight: 100;"&gt;&lt;a href="https://soundcloud.com/sashamarieradio" title="SASHA MARIE RADIO" target="_blank" style="color: #cccccc; text-decoration: none;"&gt;SASHA MARIE RADIO&lt;/a&gt; · &lt;a href="https://soundcloud.com/sashamarieradio/chapter-134" title="Chapter #134" target="_blank" style="color: #cccccc; text-decoration: none;"&gt;Chapter #134&lt;/a&gt;&lt;/div&gt;

&lt;p&gt;Two other recommendations that have been on repeat this month. A bit of &lt;em&gt;modern&lt;/em&gt;, and a bit of smooth.&lt;/p&gt;
&lt;iframe data-testid="embed-iframe" style="border-radius:12px" src="https://open.spotify.com/embed/track/3PJV3HC2LogvCrS6hDs2el?utm_source=generator&amp;theme=0" width="100%" height="152" frameBorder="0" allowfullscreen="" allow="autoplay; clipboard-write; encrypted-media; fullscreen; picture-in-picture" loading="lazy"&gt;&lt;/iframe&gt;
&lt;iframe data-testid="embed-iframe" style="border-radius:12px" src="https://open.spotify.com/embed/track/2C5M5oQOLjaTiBAInYxEty?utm_source=generator&amp;theme=0" width="100%" height="152" frameBorder="0" allowfullscreen="" allow="autoplay; clipboard-write; encrypted-media; fullscreen; picture-in-picture" loading="lazy"&gt;&lt;/iframe&gt;

&lt;p&gt;On another note, two podcast recommendations this month:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=NWZZEa9BURw"&gt;This interview from The Knowledge Project Podcast with Morgan Housel&lt;/a&gt;: I love how Morgan thinks, talks, and writes. His books are gold - and no wonder he's sold so many. He just resonates.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=xoynR-hWNZY"&gt;State of Agentic Coding with Armin and Ben&lt;/a&gt;: &lt;a href="https://lucumr.pocoo.org/"&gt;Armin Ronacher&lt;/a&gt; is one of those people at the bleeding edge of agentic workflows. Always interesting to understand how he's thinking.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="watching"&gt;Watching&lt;/h2&gt;
&lt;p&gt;While in Portugal, Vitto and I binged &lt;a href="https://www.rottentomatoes.com/tv/the_pitt"&gt;The Pitt&lt;/a&gt; on HBO Max. It's like Grey's Anatomy, on steroids, at 4x speed, and 5x the intensity, with a lot less drama. Or at least a lot less of the soap opera kind. We haven't finished it, but we definitely recommend it.&lt;/p&gt;
&lt;iframe width="100%" height="315" src="https://www.youtube.com/embed/ufR_08V38sQ?si=vG6TXcfs7fsWJwW_&amp;amp;controls=0" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen&gt;&lt;/iframe&gt;

&lt;p&gt;Since we're in Italy, and it's Australian Open time, we've been watching a good amount of that as well. Tennis skyrocketed here in popularity due to Sinner, Musetti, and Paolini. So when Tennis is being played, Italy stops. But it's not only sports. We also watch intellectually rewarding things. &lt;a href="https://en.wikipedia.org/wiki/Affari_tuoi"&gt;But not this month&lt;/a&gt;.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>How to police your agents</title><link href="https://duarteocarmo.com/blog/how-to-police-your-agents.html" rel="alternate"/><published>2026-01-23T00:00:00+01:00</published><updated>2026-01-23T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2026-01-23:/blog/how-to-police-your-agents.html</id><summary type="html">&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/94/police-agents.jpg" alt="Monahan the Police Chief" style="max-width:100%;border-radius: 2px;"&gt;&lt;/p&gt;
&lt;p&gt;Let's face it. It &lt;em&gt;might just be&lt;/em&gt; the year of agents. If you work in tech and your workflow hasn't changed in the last year or so - you're probably doing something wrong. For those of you whose workflow has changed: it's fun. We're building more than ever before!&lt;/p&gt;
&lt;p&gt;But there's a dark …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/94/police-agents.jpg" alt="Monahan the Police Chief" style="max-width:100%;border-radius: 2px;"&gt;&lt;/p&gt;
&lt;p&gt;Let's face it. It &lt;em&gt;might just be&lt;/em&gt; the year of agents. If you work in tech and your workflow hasn't changed in the last year or so - you're probably doing something wrong. For those of you whose workflow has changed: it's fun. We're building more than ever before!&lt;/p&gt;
&lt;p&gt;But there's a dark cloud in the sky. It's the pile of trash, it's the amount of unmaintainable complexity we're putting out there. Let's get the record straight — if you're writing something to throw away — you shouldn't care about any of this, but the ones that do this professionally probably should. At the end of the day - we'll have to stick around, maintain it, make sure it performs, make sure it doesn't break in unexpected ways.&lt;/p&gt;
&lt;p&gt;Whether it's a training run, a large ML pipeline, or a search eval, we need to guarantee the &lt;em&gt;quality&lt;/em&gt; of what we're shipping: we need to police our agents.&lt;/p&gt;
&lt;p&gt;Now how do you do that? I do it in three ways.&lt;/p&gt;
&lt;h2 id="the-constitution"&gt;The Constitution&lt;/h2&gt;
&lt;p&gt;My global &lt;code&gt;AGENTS.md&lt;/code&gt; &lt;sup id="sf-how-to-police-your-agents-1-back"&gt;&lt;a href="#sf-how-to-police-your-agents-1" class="simple-footnote" title="Or CLAUDE.md, it doesn't matter. They are all symlinked. Claude Code, OpenCode, Codex, or even pi, they all use the same set of rules."&gt;1&lt;/a&gt;&lt;/sup&gt; is where I define the most important laws I want my agents to follow. These are closely related to my personality and taste. It's a file with scars of everything I've seen wrong from the start of my career.&lt;/p&gt;
&lt;p&gt;The file has 55 lines of Markdown and grows daily. "About Your User," "Your Behavior," "Principles," "Code Search," or "Languages" are just some of its sections. The idea behind it is simple: whatever I'm working on should adhere to my way of working. Also, I'm peculiar about how I want my agents to interact with me. For example, I don't want them to speak Chinese to me, and I don't want them to assume I know things I know nothing about.&lt;/p&gt;
&lt;p&gt;Here are some highlights from my &lt;code&gt;AGENTS.md&lt;/code&gt;:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;...

&lt;span class="gu"&gt;## Documentation and plans&lt;/span&gt;

&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;If docs were touched, make sure you update them with latest changes. You can run some quick ripgreps with md files to confirm if needed.
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;For docs: Be concise and durable - point to source code for specifics rather than hardcoding values that will get out of sync
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Every time the user starts repeating something very specific about a project, consider adding a rule to the AGENTS.md or CLAUDE.md of that project if one exists.
...

&lt;span class="gu"&gt;## Languages&lt;/span&gt;

&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;I understand portuguese, french, italian, spanish, and english. Everything else, you should show the original AND the translation
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Outputting in any other language is useless, and should come with a corresponding translation

&lt;span class="gu"&gt;## Principles&lt;/span&gt;

&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Every line of code is a liability - we should strive to make our code simple and concise
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;I prefer my functions to be small in interface but long in functionality
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;We don't MOCK in tests. We use real data and real APIs. I absolutely hate mocking
...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;It's a series of &lt;em&gt;very&lt;/em&gt; specific rules I want my agents to respect. You might like a different set of rules — and that's fine — I don't expect us to like the same things. But I don't believe you don't have preferences. So make sure to create one as well.&lt;/p&gt;
&lt;p&gt;This file is the Constitution. And the rules of the Constitution apply to anything I'm doing. But every project has &lt;em&gt;quirks&lt;/em&gt;.&lt;/p&gt;
&lt;h2 id="the-rules-of-the-game"&gt;The rules of the game&lt;/h2&gt;
&lt;p&gt;Every project has something specific and annoying, even more so if you're working with a team. It can be anything: a weird commit convention, a particular way of running tests, a specific server that needs to be run, an unconventional way of making a release.&lt;/p&gt;
&lt;p&gt;Your project's &lt;code&gt;AGENTS.md&lt;/code&gt; defines the rules of the game. That's where you inform the agents about the quirks of the project. A cool little trick I use is a rule in my global &lt;code&gt;AGENTS.md&lt;/code&gt; file that prompts the agent to update my project-level &lt;code&gt;AGENTS.md&lt;/code&gt; file whenever I keep repeating something.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;... Global AGENTS.md
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Every time the user starts repeating something very specific about a project, consider adding a rule to the AGENTS.md or CLAUDE.md of that project if one exists.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Here's an example snippet of a real world project &lt;code&gt;AGENTS.md&lt;/code&gt;. This one is for a Machine Learning pipeline built using AWS CDK:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;...Local AGENTS.md
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;For database queries, when the user is asking questions (not for code implementations), use &lt;span class="sb"&gt;`aws rds-data execute-statement`&lt;/span&gt; with the Aurora resource ARN and Secrets Manager secret ARN.
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Code artifacts should be placed in ./code_artifacts/ directory
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Scripts should be placed in the ./scripts/ directory and should be uv scripts (https://docs.astral.sh/uv/guides/scripts/)
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;If name has an acronym, like &lt;span class="sb"&gt;`ML`&lt;/span&gt;, &lt;span class="sb"&gt;`OIDC`&lt;/span&gt; only capitalise the first letter, e.g. &lt;span class="sb"&gt;`Ml`&lt;/span&gt;, &lt;span class="sb"&gt;`Oidc`&lt;/span&gt;.
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Adhere to CDK Best Practices https://docs.aws.amazon.com/cdk/v2/guide/best-practices.html
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Ensure comments are added only when they are natural and consistent with the rest of the file
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Include defensive checks or try/catch blocks only when they align with the norms of that area of the codebase (especially if not called by trusted / validated codepaths)
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Maintain consistency with the style used throughout the file
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Add comments when we are doing something non-obvious or complex, not to separate every line of code
...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Note: You might also consider using something like &lt;a href="https://code.claude.com/docs/en/skills"&gt;Skills&lt;/a&gt; to solve this level of policing. Some people prefer them. I use a mix.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;This ensures that whoever is building here, will generally follow the same guidelines. And these rules can go a long way to making your life easier — but they still don't &lt;em&gt;enforce&lt;/em&gt; anything — for that, we need &lt;strong&gt;the police&lt;/strong&gt;.&lt;/p&gt;
&lt;h2 id="the-police"&gt;The police&lt;/h2&gt;
&lt;p&gt;Policing is all about enforcing rules. What better way to enforce rules for code than with static code analysis? Before agents, I used to find these tools annoying. But nowadays, if there are 154 line-length issues in the project, I can just spin up an agent to fix them.&lt;/p&gt;
&lt;p&gt;And so, I am &lt;em&gt;very&lt;/em&gt; picky when policing agents. Let's go through a Python example.&lt;/p&gt;
&lt;p&gt;Everything starts (as always) with a &lt;code&gt;Makefile&lt;/code&gt;. This is where I define the main commands to be run. I always include a &lt;code&gt;format&lt;/code&gt; and a &lt;code&gt;check&lt;/code&gt; command. Remember, we are policing, we are picky, we are annoying: ruff, Pyright, deptry, we run it all.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nf"&gt;.PHONY&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;
&lt;span class="nf"&gt;format&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# Format the codebase with ruff&lt;/span&gt;
&lt;span class="w"&gt; &lt;/span&gt;uv&lt;span class="w"&gt; &lt;/span&gt;lock&lt;span class="w"&gt; &lt;/span&gt;--check
&lt;span class="w"&gt; &lt;/span&gt;uv&lt;span class="w"&gt; &lt;/span&gt;run&lt;span class="w"&gt; &lt;/span&gt;ruff&lt;span class="w"&gt; &lt;/span&gt;check&lt;span class="w"&gt; &lt;/span&gt;.&lt;span class="w"&gt; &lt;/span&gt;--fix
&lt;span class="w"&gt; &lt;/span&gt;uv&lt;span class="w"&gt; &lt;/span&gt;run&lt;span class="w"&gt; &lt;/span&gt;ruff&lt;span class="w"&gt; &lt;/span&gt;format&lt;span class="w"&gt; &lt;/span&gt;.

&lt;span class="nf"&gt;.PHONY&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;check&lt;/span&gt;
&lt;span class="nf"&gt;check&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# Run linting and check&lt;/span&gt;
&lt;span class="w"&gt; &lt;/span&gt;uv&lt;span class="w"&gt; &lt;/span&gt;lock&lt;span class="w"&gt; &lt;/span&gt;--check
&lt;span class="w"&gt; &lt;/span&gt;uv&lt;span class="w"&gt; &lt;/span&gt;run&lt;span class="w"&gt; &lt;/span&gt;ruff&lt;span class="w"&gt; &lt;/span&gt;format&lt;span class="w"&gt; &lt;/span&gt;--check&lt;span class="w"&gt; &lt;/span&gt;.
&lt;span class="w"&gt; &lt;/span&gt;uv&lt;span class="w"&gt; &lt;/span&gt;run&lt;span class="w"&gt; &lt;/span&gt;ruff&lt;span class="w"&gt; &lt;/span&gt;check&lt;span class="w"&gt; &lt;/span&gt;.
&lt;span class="w"&gt; &lt;/span&gt;uv&lt;span class="w"&gt; &lt;/span&gt;run&lt;span class="w"&gt; &lt;/span&gt;pyright
&lt;span class="w"&gt; &lt;/span&gt;uv&lt;span class="w"&gt; &lt;/span&gt;run&lt;span class="w"&gt; &lt;/span&gt;deptry&lt;span class="w"&gt; &lt;/span&gt;.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We are so annoying that we enforce more rules than we would enforce on ourselves. So, for example, for ruff, we extend the rules to stop our agent from dropping a sneaky TODO comment instead of implementing something that should be there.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;[tool.ruff.lint]&lt;/span&gt;
&lt;span class="n"&gt;extend-select&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"F"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="c1"&gt;# Pyflakes rules&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"W"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="c1"&gt;# PyCodeStyle warnings&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"E"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="c1"&gt;# PyCodeStyle errors&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"I"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="c1"&gt;# Sort imports properly&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"UP"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;# Warn if certain things can be changed due to newer Python versions&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"C4"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;# Catch incorrect use of comprehensions, dict, list, etc&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"FA"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;# Enforce from __future__ import annotations&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"ISC"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;# Good use of string concatenation&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"ICN"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;# Use common import conventions&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"RET"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;# Good return practices&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"SIM"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;# Common simplification rules&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"TID"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;# Some good import practices&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"TC"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;# Enforce importing certain types in a TYPE_CHECKING block&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"PTH"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;# Use pathlib instead of os.path&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"TD"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;# Be diligent with TODO comments&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"NPY"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;# Some numpy-specific things&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Note: Shamelessly stolen from reddit user TheBB&lt;/span&gt;

&lt;span class="c1"&gt;# https://www.reddit.com/r/Python/comments/1kttfst/ruff_users_what_rules_are_using_and_what_are_you/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;It's a nice, programmatic, and deterministic way of enforcing rules without having to write each one into an &lt;code&gt;AGENTS.md&lt;/code&gt; and hope the agent follows them. With that, every time the agent writes some code, it runs &lt;code&gt;make format&lt;/code&gt; and then &lt;code&gt;make check&lt;/code&gt;. If anything fails, it fixes it.&lt;/p&gt;
&lt;h2 id="final-thoughts"&gt;Final thoughts&lt;/h2&gt;
&lt;p&gt;We are becoming faster and faster at producing software. By now, that’s not &lt;em&gt;a possibility&lt;/em&gt;, that’s a fact. But the problem was never writing the code. It was always about two things: (1) knowing what code &lt;em&gt;not&lt;/em&gt; to write and (2) managing complexity. And what better way of managing complexity than by diligently &lt;em&gt;policing&lt;/em&gt; your agents?&lt;/p&gt;
&lt;p&gt;Vibe, but not too close to the sun.&lt;/p&gt;
&lt;hr&gt;&lt;ol class="simple-footnotes"&gt;&lt;li id="sf-how-to-police-your-agents-1"&gt;Or &lt;code&gt;CLAUDE.md&lt;/code&gt;, it doesn't matter. They are all symlinked. &lt;a href="https://claude.ai/code"&gt;Claude Code&lt;/a&gt;, &lt;a href="https://opencode.ai/"&gt;OpenCode&lt;/a&gt;, &lt;a href="https://developers.openai.com/codex/cli/"&gt;Codex&lt;/a&gt;, or even &lt;a href="https://shittycodingagent.ai/"&gt;pi&lt;/a&gt;, they all use the same set of rules. &lt;a href="#sf-how-to-police-your-agents-1-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;</content><category term="blog"/></entry><entry><title>Limpa: Ad-Free podcasts powered by LLMs</title><link href="https://duarteocarmo.com/blog/limpa-ad-free-podcasts-powered-by-llms.html" rel="alternate"/><published>2026-01-05T00:00:00+01:00</published><updated>2026-01-05T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2026-01-05:/blog/limpa-ad-free-podcasts-powered-by-llms.html</id><summary type="html">&lt;p&gt;I get up feeling sleepy. I lace up my running shoes and head out the door. I fire up my favourite podcast. "This show is brought to you by..." I &lt;em&gt;hate&lt;/em&gt; ads.&lt;/p&gt;
&lt;p&gt;I understand the attention economy. Companies are capitalizing more and more on everyone's time. I have nothing against …&lt;/p&gt;</summary><content type="html">&lt;p&gt;I get up feeling sleepy. I lace up my running shoes and head out the door. I fire up my favourite podcast. "This show is brought to you by..." I &lt;em&gt;hate&lt;/em&gt; ads.&lt;/p&gt;
&lt;p&gt;I understand the attention economy. Companies are capitalizing more and more on everyone's time. I have nothing against it. But I'm also a big fan of protecting my time. A &lt;a href="https://pi-hole.net/"&gt;Pi-hole&lt;/a&gt; under my desk, &lt;a href="https://ublockorigin.com/"&gt;uBlock Origin&lt;/a&gt;, &lt;a href="https://sponsor.ajay.app/"&gt;SponsorBlock&lt;/a&gt;, I run them all. My time is mine, unless I tell you otherwise.&lt;/p&gt;
&lt;p&gt;And I spend a lot of time listening to podcasts. And what are podcasts at the end of the day? Just mp3 files. Once you download them from the server, they're yours.&lt;/p&gt;
&lt;p&gt;So I built &lt;a href="https://github.com/duarteocarmo/limpa"&gt;Limpa&lt;/a&gt;&lt;sup id="sf-limpa-ad-free-podcasts-powered-by-llms-1-back"&gt;&lt;a href="#sf-limpa-ad-free-podcasts-powered-by-llms-1" class="simple-footnote" title='It means "clean" in Portuguese'&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/duarteocarmo/limpa" target="_blank"&gt;
&lt;/a&gt;&lt;/p&gt;&lt;center&gt;&lt;a href="https://github.com/duarteocarmo/limpa" target="_blank"&gt;
&lt;img src="https://duarteocarmo.com/images/93/limpa_screenshot.png" alt="Limpa screenshot" style="max-width: 100%; border-radius: 2px"&gt;
&lt;/a&gt;&lt;/center&gt;&lt;a href="https://github.com/duarteocarmo/limpa" target="_blank"&gt;
&lt;/a&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Limpa is a simple yet powerful web app. You paste in the &lt;a href="https://rss.com/tools/find-my-feed/"&gt;RSS feed&lt;/a&gt; of your favourite podcast, and it gives you back an ad-free feed that you can plug into your favourite podcast app.&lt;/p&gt;
&lt;p&gt;It's a simple &lt;a href="https://www.djangoproject.com/"&gt;Django&lt;/a&gt; app with &lt;a href="https://htmx.org/"&gt;htmx&lt;/a&gt; on the front-end. A stack I love since it has few moving pieces. Every time a podcast gets added, I transcribe the latest episode using &lt;a href="https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3"&gt;NVIDIA's Parakeet v3&lt;/a&gt; running on &lt;a href="https://modal.com/"&gt;Modal&lt;/a&gt;. Once I extract the &lt;a href="https://github.com/duarteocarmo/limpa/blob/master/limpa/services/modal_transcription.py#L72"&gt;transcript with timestamps&lt;/a&gt;, I can &lt;a href="https://github.com/duarteocarmo/limpa/blob/master/limpa/services/extract.py#L48"&gt;prompt an LLM&lt;/a&gt; to get the exact timestamps of the ads that should be cut out. I then pass those to &lt;a href="https://www.ffmpeg.org/"&gt;ffmpeg&lt;/a&gt; and voilà: ad-free podcasts.&lt;/p&gt;
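&lt;p&gt;To make the last step concrete, here is a minimal sketch of how a list of ad timestamps could be turned into an ffmpeg command. This is not Limpa's actual code: the function names, file names, and the choice of the &lt;code&gt;aselect&lt;/code&gt; and &lt;code&gt;asetpts&lt;/code&gt; audio filters are assumptions for illustration, and the segments are made up.&lt;/p&gt;

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds)


def merge_segments(ads: List[Segment]) -> List[Segment]:
    """Sort ad segments and merge any that overlap or touch."""
    merged: List[Segment] = []
    for start, end in sorted(ads):
        if merged and merged[-1][1] >= start:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def build_ffmpeg_command(src: str, dst: str, ads: List[Segment]) -> List[str]:
    """Build an ffmpeg command that drops every ad segment from the audio."""
    ads = merge_segments(ads)
    if not ads:
        # Nothing to cut: copy the streams without re-encoding.
        return ["ffmpeg", "-i", src, "-c", "copy", dst]
    # aselect keeps only samples whose timestamp is outside every ad range;
    # asetpts then rewrites timestamps so the output has no gaps.
    ranges = "+".join(f"between(t,{s:g},{e:g})" for s, e in ads)
    audio_filter = f"aselect='not({ranges})',asetpts=N/SR/TB"
    return ["ffmpeg", "-i", src, "-af", audio_filter, dst]


# Hypothetical ad segments (in seconds), as an LLM might return them.
cmd = build_ffmpeg_command("episode.mp3", "clean.mp3", [(300, 360), (0, 30.5), (350, 400)])
print(" ".join(cmd))
```

&lt;p&gt;The resulting list can be handed to &lt;code&gt;subprocess.run&lt;/code&gt;. Merging overlapping segments first keeps the filter expression short when the LLM reports the same ad twice.&lt;/p&gt;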
&lt;p&gt;Django's new &lt;a href="https://docs.djangoproject.com/en/6.0/topics/tasks/"&gt;Tasks framework&lt;/a&gt; has also been a joy to use. No more &lt;a href="https://docs.celeryq.dev/en/v5.5.3/django/first-steps-with-django.html"&gt;Celery&lt;/a&gt;, no more &lt;a href="https://flower.readthedocs.io/en/latest/"&gt;Flower&lt;/a&gt;, just a simple database to run background tasks. The fewer moving pieces, the fewer mistakes &lt;a href="https://opencode.ai/"&gt;my agents&lt;/a&gt; are prone to making.&lt;/p&gt;
&lt;p&gt;If you're wondering - no - I'm not planning on hosting this service for others. I'm not interested in discussing the nuances and implications of providing this as a service to others.&lt;sup id="sf-limpa-ad-free-podcasts-powered-by-llms-2-back"&gt;&lt;a href="#sf-limpa-ad-free-podcasts-powered-by-llms-2" class="simple-footnote" title="That might be illegal, and you probably shouldn't do it"&gt;2&lt;/a&gt;&lt;/sup&gt; The project's simple and &lt;a href="https://github.com/duarteocarmo/limpa"&gt;well documented&lt;/a&gt;, so it should be easy for you to run it for yourself.&lt;/p&gt;
&lt;p&gt;Feel free to fork, and build on top of it!&lt;/p&gt;&lt;ol class="simple-footnotes"&gt;&lt;li id="sf-limpa-ad-free-podcasts-powered-by-llms-1"&gt;It means "clean" in Portuguese &lt;a href="#sf-limpa-ad-free-podcasts-powered-by-llms-1-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;li id="sf-limpa-ad-free-podcasts-powered-by-llms-2"&gt;That might be illegal, and you probably shouldn't do it &lt;a href="#sf-limpa-ad-free-podcasts-powered-by-llms-2-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;</content><category term="blog"/></entry><entry><title>Retrospectiva #3</title><link href="https://duarteocarmo.com/blog/retrospectiva-3.html" rel="alternate"/><published>2025-12-25T00:00:00+01:00</published><updated>2025-12-25T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2025-12-25:/blog/retrospectiva-3.html</id><summary type="html">&lt;p&gt;And just like that, it's the end of the year again. We did not expect to make it to the South for Christmas this year. Getting a passport for a newborn is a painful, bureaucracy-filled process, especially for a Portuguese-Italian baby born in Denmark. But somehow—magically—the Italian embassy …&lt;/p&gt;</summary><content type="html">&lt;p&gt;And just like that, it's the end of the year again. We did not expect to make it to the South for Christmas this year. Getting a passport for a newborn is a painful, bureaucracy-filled process, especially for a Portuguese-Italian baby born in Denmark. But somehow—magically—the Italian embassy came through in 2 days (looking at you and your 4-month wait, Portugal). And so we flew down and surprised everyone.&lt;/p&gt;
&lt;p&gt;Lisbon. It's loud, dirty, &lt;a href="https://www.theguardian.com/commentisfree/2025/jun/25/lisbon-europe-portugal-golden-visa-capital-investors-short-term-rentals"&gt;well-known for all the wrong reasons&lt;/a&gt;, but it's &lt;em&gt;home&lt;/em&gt;. I couldn't be happier to bring Allegra down to spend her first Christmas with the family.&lt;/p&gt;
&lt;p&gt;It's the end of the year, so before we get to the &lt;em&gt;regularly scheduled&lt;/em&gt; Retrospectiva, let's talk a bit about 2025.&lt;/p&gt;
&lt;h2 id="2025-look-back"&gt;2025 look back&lt;/h2&gt;
&lt;p&gt;2025 was an eventful year. On the personal side, one of the most eventful years in a while. We got married and managed to put everyone we love in a single place. We travelled just the right amount. I've read just the right amount of books - and started reading more fiction.&lt;/p&gt;
&lt;p&gt;As for running, it was a transition year. I used to run two marathons every year. Recently, I decided to tone that down to just one. This year I ran the &lt;a href="https://oslomaraton.no/en/"&gt;Oslo marathon&lt;/a&gt;, and a half-marathon across the &lt;a href="https://brolobet2025.dk/"&gt;Øresund&lt;/a&gt; bridge. No clean training block for either of them. I did try to keep my mileage relatively high. As I'm writing this, I've run 1937 km this year. Pretty close to my 2000 km target for the year. Maybe I can still make it.&lt;/p&gt;
&lt;p&gt;2025 was &lt;em&gt;also&lt;/em&gt; a challenging year for our family. I won't get into details. A lot of unexpected things happened. Thankfully, we all managed to pull through. The year finished with a bang and 'Legra is here with us now.&lt;/p&gt;
&lt;p&gt;As for work, the year was busy. The stakes were higher, and more was expected. Larger scale, more challenging problems, and lots of new problems. There are two main themes that I'm noticing. The first is that LLM-backed applications are growing more than ever - but most teams don't really know how to measure and improve them. The second is the &lt;a href="https://x.com/duarteocarmo/status/2001566862182879738"&gt;great cleanup&lt;/a&gt; - which is a topic for a future post.&lt;/p&gt;
&lt;p&gt;Now, at the end of December, we take a deep breath - stop to think - and put some goals on the board for 2026.&lt;/p&gt;
&lt;h2 id="using"&gt;Using&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Oura ring&lt;/strong&gt; (&lt;a href="https://ouraring.com/store/rings"&gt;link&lt;/a&gt;): For years I've been a happy Garmin user. 10-day battery life, great for tracking my running. But there's one thing that always annoyed me: sleep tracking. If I wake up at 4 AM and get back to bed, it doesn't mean I slept 5 hours. Also, Garmin has a &lt;em&gt;passive&lt;/em&gt; approach to giving me insights. I can go and browse the graphs - but they won't proactively tell me how I'm doing. I hate subscriptions. The &lt;a href="https://us.amazfit.com/products/helio-strap"&gt;Helio Strap&lt;/a&gt; was good. But still passive - so I sold it. The Oura &lt;em&gt;delivers&lt;/em&gt;. The app is well designed, the sleep tracking is accurate, and the insights are actually actionable.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bimby TM7&lt;/strong&gt; (&lt;a href="https://www.vorwerk.com/pt/pt/c/home/produtos/bimby/tm7"&gt;link&lt;/a&gt;): I used to cook the &lt;em&gt;exact same&lt;/em&gt; pasta dish every single day. Food is important - I just never cared. Cooking was just always too abstract. A "spoon" of this, a "pinch" of that - what the hell do you mean? The Bimby has removed any and all friction in getting an actual dish ready. I can browse through thousands of recipes and I don't need to think, I can just follow the instructions. The iOS app is also great - and warns me when I need to come back to it and do something. It's stupid expensive - but we are incredibly happy with it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OpenCode&lt;/strong&gt; (&lt;a href="https://opencode.ai/"&gt;link&lt;/a&gt;): I've told you before: at this pace of change my editing workflow changes almost every week. In the past few weeks I've caught myself ditching Claude Code and Codex more and more for OpenCode. The terminal UI is beautiful. It's &lt;a href="https://github.com/sst/opencode"&gt;100% open source&lt;/a&gt;. It allows me to use any model I want. This means any &lt;a href="https://openrouter.ai/"&gt;OpenRouter&lt;/a&gt; model, or - even better - any model already included in my GitHub Copilot plan. I don't need another subscription. Especially for something closed source.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Neovim Pack&lt;/strong&gt; (&lt;a href="https://neovim.io/doc/user/pack.html"&gt;link&lt;/a&gt;): I live mostly in the terminal. Not because I think it's cool. But because that's where I'm most productive. For the past few years I've used &lt;a href="https://www.lazyvim.org/"&gt;LazyVim&lt;/a&gt; for &lt;a href="https://github.com/duarteocarmo/dotfiles/tree/master/.config/nvim"&gt;my config&lt;/a&gt;, but something started to feel bloated. Too many things going on, too many bells and whistles I don't need. I want to get in, change, get out. Neovim now has a &lt;a href="https://neovim.io/doc/user/pack.html"&gt;native package manager&lt;/a&gt;, and I've started building my own minimal config with it (currently &lt;a href="https://github.com/duarteocarmo/dotfiles/blob/master/.config/nvim-minimal/init.lua"&gt;100 lines of Lua&lt;/a&gt; - and a work in progress).&lt;/p&gt;
&lt;h2 id="reading"&gt;Reading&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Old Man's War - John Scalzi&lt;/strong&gt; (&lt;a href="https://www.goodreads.com/book/show/36510196-old-man-s-war"&gt;link&lt;/a&gt;): Still working my way through this one. I knew Scalzi wouldn't let me down. I love his direct, no-bs writing style. Nothing beats ending the day with a book set somewhere very, &lt;em&gt;very&lt;/em&gt; far away. There are some interesting parallels to our world right now. Everyone's got their own personal AI assistant (which the main character nicknames "Asshole"), and everyone has to sign a user agreement to use their new body. It almost feels like non-fiction.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;LLMOps - Abi Aryan&lt;/strong&gt; (&lt;a href="https://www.amazon.com/LLMOps-Managing-Language-Models-Production/dp/1098154207#customerReviews"&gt;link&lt;/a&gt;): My technical read for the holiday season, the next on my technical book backlog. I've been a reader of Abi's &lt;a href="https://modelcraft.substack.com/"&gt;ModelCraft&lt;/a&gt; newsletter, and have high hopes for this one.&lt;/p&gt;
&lt;h2 id="listening"&gt;Listening&lt;/h2&gt;
&lt;p&gt;I've been listening to a couple of different music "areas". I don't think we can call them genres.&lt;/p&gt;
&lt;p&gt;The first one is &lt;strong&gt;Italian indie alternative&lt;/strong&gt;. One of my favourite things about speaking another language is that you get to immerse yourself in the culture that speaks it, which includes - obviously - their music. I've spent the last few years exploring most of what the Italian scene has to offer, but lately it's been a mix of &lt;a href="https://open.spotify.com/artist/3r7ayqsHrVs1aeD7SiBTsz?si=g1TEFkFfQaeggjUn6R0ypg"&gt;irossa&lt;/a&gt;, some &lt;a href="https://open.spotify.com/track/3RpG5glcogbXModIh6P87R?si=94db23bf81264bc2"&gt;weird things&lt;/a&gt;, and &lt;a href="https://open.spotify.com/artist/0ErDKYNv448COBCNdnqYIm?si=BVcyo6ipTPCM97Co4Y5f_g"&gt;Marco Castello&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The second one I call &lt;strong&gt;jazz ambient&lt;/strong&gt;. When I play music (almost always), I tend to play it all over the house. Recently, both &lt;a href="https://open.spotify.com/artist/2h5syT5XdsQgKLq8Yn1klO?si=HBNSUbwURWGRA1vervExSQ"&gt;Nala Sinephro&lt;/a&gt; and &lt;a href="https://open.spotify.com/artist/2eO4klJg324zroGqnBkqk3?si=OdCnbiz4Q86HcVcXcct39g"&gt;cktrl&lt;/a&gt; have been on heavy rotation. They surely put me in the Christmas spirit.&lt;/p&gt;
&lt;div style="display: flex; flex-wrap: wrap; gap: 12px;"&gt;
  &lt;iframe style="border-radius:12px; flex:1 1 300px; min-width:0;" src="https://open.spotify.com/embed/track/3zrfPWeaPgpUZyu19BjBTf?utm_source=generator" height="352" frameBorder="0" allowfullscreen="" allow="autoplay; clipboard-write; encrypted-media; fullscreen; picture-in-picture" loading="lazy"&gt;&lt;/iframe&gt;
  &lt;iframe style="border-radius:12px; flex:1 1 300px; min-width:0;" src="https://open.spotify.com/embed/track/2QUXnZuZYqc5RAgSJILpb6?utm_source=generator" height="352" frameBorder="0" allowfullscreen="" allow="autoplay; clipboard-write; encrypted-media; fullscreen; picture-in-picture" loading="lazy"&gt;&lt;/iframe&gt;
&lt;/div&gt;

&lt;p&gt;&lt;br&gt;
That's it for this month's Retrospectiva. See you in 2026!&lt;/p&gt;</content><category term="blog"/></entry><entry><title>From NutriBench to Taralli: How far can you take a prompt?</title><link href="https://duarteocarmo.com/blog/from-nutribench-to-taralli-how-far-can-you-take-a-prompt.html" rel="alternate"/><published>2025-12-23T00:00:00+01:00</published><updated>2025-12-23T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2025-12-23:/blog/from-nutribench-to-taralli-how-far-can-you-take-a-prompt.html</id><summary type="html">&lt;p&gt;&lt;/p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/91/experiment_results.png" alt="Taralli experiment results" style="max-width: 100%; border-radius: 2px"&gt;&lt;p&gt;&lt;/p&gt;
&lt;figcaption&gt;Benchmarking calorie prediction for Taralli&lt;/figcaption&gt;
&lt;p&gt;&lt;/p&gt;&lt;/center&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;There's something very funny about the current Machine Learning and AI landscape. If you're in the field you probably heard about it. "Vibes" they call it. When someone wants to test something out, they conduct a "vibe test".&lt;/p&gt;
&lt;p&gt;I call bullshit. How are you supposed …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;/p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/91/experiment_results.png" alt="Taralli experiment results" style="max-width: 100%; border-radius: 2px"&gt;&lt;p&gt;&lt;/p&gt;
&lt;figcaption&gt;Benchmarking calorie prediction for Taralli&lt;/figcaption&gt;
&lt;p&gt;&lt;/p&gt;&lt;/center&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;There's something very funny about the current Machine Learning and AI landscape. If you're in the field you probably heard about it. "Vibes" they call it. When someone wants to test something out, they conduct a "vibe test".&lt;/p&gt;
&lt;p&gt;I call bullshit. How are you supposed to improve something if you can't even measure it? How do you know you didn't just make it worse?&lt;/p&gt;
&lt;p&gt;In the past month, I've been working on &lt;a href="https://duarteocarmo.com/blog/taralli-home-cooked-food-tracking-without-the-bs"&gt;Taralli&lt;/a&gt;. And I'd like to show you how to incrementally improve an LLM based system.&lt;/p&gt;
&lt;p&gt;But first, let's take a little side quest.&lt;/p&gt;
&lt;h2 id="nutribench-evaluating-llms-on-nutrition-estimation"&gt;NutriBench: Evaluating LLMs on nutrition estimation&lt;/h2&gt;
&lt;p&gt;While browsing arXiv the other day &lt;a href="https://arxiv.org/abs/2407.12843"&gt;I stumbled upon NutriBench&lt;/a&gt;. This is exciting - I thought to myself - it looks like someone else is also looking into this problem. NutriBench is a research project launched by the University of California that looks to quantify exactly Taralli's problem: "How good are Large Language Models at estimating the nutritional content of foods?".&lt;/p&gt;
&lt;p&gt;The team created &lt;a href="https://huggingface.co/datasets/dongx1997/NutriBench/viewer/v1?views%5B%5D=v1_wweia_meal_metric"&gt;a dataset of ~12K meal descriptions&lt;/a&gt; from two data sources (&lt;a href="https://www.ars.usda.gov/northeast-area/beltsville-md-bhnrc/beltsville-human-nutrition-research-center/food-surveys-research-group/docs/wweianhanes-overview/"&gt;WWEIA&lt;/a&gt; and &lt;a href="https://www.fao.org/gift-individual-food-consumption/en/"&gt;FAO/WHO&lt;/a&gt;). Every row in the dataset is a meal description with the corresponding nutritional details (carbs, protein, fat, and energy). To answer the question, they tested 4 different prompting techniques and 12 different LLMs. The prompting techniques were the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Base&lt;/strong&gt;: A simple prompt with some few-shot examples.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Chain-of-Thought (CoT)&lt;/strong&gt;: A slightly more complex prompt with some reasoning steps to "force" the model to reason.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Retrieval-Augmented Generation (RAG)&lt;/strong&gt;: A system that uses a retrieval database (which they called Retri-DB), to aid the LLM in estimating carbs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;RAG+CoT&lt;/strong&gt;: A mix of the former two.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href="https://gist.github.com/duarteocarmo/e9d89bd75363be3540fcdf8074e71100"&gt;If you are curious about the prompts, click here.&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/91/nutribench_results.png" alt="NutriBench results" style="max-width: 100%;border-radius: 2px"&gt;&lt;p&gt;&lt;/p&gt;
&lt;figcaption&gt;NutriBench results&lt;/figcaption&gt;
&lt;p&gt;&lt;/p&gt;&lt;/center&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;You're probably wondering: okay, but how good are the models at this? The answer is a resounding &lt;em&gt;MEH&lt;/em&gt;. The best performer is GPT-4o with a Chain-of-Thought prompt. It responds to ~99% of prompts (models can refuse to answer) and achieves an Acc@7.5 of 66.82%&lt;sup id="sf-from-nutribench-to-taralli-how-far-can-you-take-a-prompt-1-back"&gt;&lt;a href="#sf-from-nutribench-to-taralli-how-far-can-you-take-a-prompt-1" class="simple-footnote" title="For reference, this is on par with a Human nutritionist with internet access"&gt;1&lt;/a&gt;&lt;/sup&gt;. In other words, &lt;strong&gt;its carb predictions fall within ±7.5g of the actual value about two-thirds of the time&lt;/strong&gt;.&lt;/p&gt;
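&lt;p&gt;For intuition, here is a small sketch of how an Acc@7.5 style metric can be computed. The function name and the carb values are made up for illustration, and treating the ±7.5g boundary as inclusive is an assumption, not something taken from the NutriBench paper.&lt;/p&gt;

```python
from typing import Sequence


def accuracy_at(predicted: Sequence[float], actual: Sequence[float], tolerance: float = 7.5) -> float:
    """Fraction of predictions within +/- tolerance (grams) of the true value."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    # Count a hit when the absolute error is at most the tolerance
    # (inclusive boundary is an assumption).
    hits = sum(1 for p, a in zip(predicted, actual) if tolerance >= abs(p - a))
    return hits / len(predicted)


# Made-up carb values in grams, for illustration only.
predicted_carbs = [40.0, 62.0, 15.0]
actual_carbs = [45.0, 50.0, 22.5]
print(accuracy_at(predicted_carbs, actual_carbs))
```

&lt;p&gt;In this toy example two of the three predictions land within 7.5g of the true value, so the score is two-thirds, roughly what GPT-4o gets on the real benchmark.&lt;/p&gt;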
&lt;p&gt;What if we could use the NutriBench dataset to improve Taralli?&lt;/p&gt;
&lt;h2 id="improving-tarallis-nutritional-estimation"&gt;Improving Taralli's nutritional estimation&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://gist.github.com/duarteocarmo/58fda6d982fb6d9dd5bfb561214a77ce"&gt;Here's the notebook with the entire process&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The first thing, as always, is to create a good dataset. I created a new dataset of 107 examples of food descriptions and their corresponding total calories&lt;sup id="sf-from-nutribench-to-taralli-how-far-can-you-take-a-prompt-2-back"&gt;&lt;a href="#sf-from-nutribench-to-taralli-how-far-can-you-take-a-prompt-2" class="simple-footnote" title="I could have used more, but wanted to keep things fast"&gt;2&lt;/a&gt;&lt;/sup&gt;. I used the &lt;a href="https://huggingface.co/datasets/dongx1997/NutriBench"&gt;second version of the NutriBench&lt;/a&gt; dataset on Hugging Face, and mixed it with some other examples from a previous golden dataset.&lt;/p&gt;
&lt;details&gt;
  &lt;summary&gt;Expand to see some examples of the dataset&lt;/summary&gt;
  &lt;pre&gt;&lt;code&gt;
 [
  {
    "input": "For dinner, I'm having 25 grams of bread, 150 grams of chicken wings, and a 250-gram mixed vegetable salad.",
    "total_calories": 633.0,
    "food_groups": [
      "grain",
      "meat and alternatives",
      "vegetable"
    ],
    "source": "nutribench"
  },
  {
    "input": "I enjoyed 200 grams of tea with sugar along with 230 grams of coconut milk rice for breakfast.",
    "total_calories": 558.0,
    "food_groups": [
      "grain",
      "fruit"
    ],
    "source": "nutribench"
  },
  {
    "input": "stracciatella with confit grapes and 2 little focaccia pieces",
    "total_calories": 350.0,
    "food_groups": [
      "fruit",
      "dairy",
      "grain"
    ],
    "source": "golden_dataset"
  }
]
&lt;/code&gt;&lt;/pre&gt;
&lt;/details&gt;

&lt;p&gt;Now I needed to design an evaluation metric. For NutriBench they used accuracy of carb prediction at ±7.5 grams. In our case, I'm more interested in the accuracy of the calorie prediction. One of the nice things about DSPy is that it forces you to write down your evaluation metric explicitly. Below is the evaluation metric I used. In short, we want the predicted calories to be within 10% of the ground truth calories, which means our metric is Accuracy@10%.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;eval_metric&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pred&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pred_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pred_trace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;None&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Prediction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# see notebook for more detailed metric (this is simplified)&lt;/span&gt;
    &lt;span class="n"&gt;total_calories_example&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_calories&lt;/span&gt;
    &lt;span class="n"&gt;total_calories_predicted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;na&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_calories&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;within_threshold&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_calories_example&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;total_calories_predicted&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="nb"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="mf"&gt;0.1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;total_calories_example&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;within_threshold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
        &lt;span class="n"&gt;feedback_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"INCORRECT: Your answer was not within 10% of the correct answer (yours: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;total_calories_predicted&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, correct: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;total_calories_example&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;)"&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Prediction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;feedback&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;feedback_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;feedback_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"CORRECT: Your answer was within 10% of the correct answer (yours: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;total_calories_predicted&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, correct: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;total_calories_example&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;)"&lt;/span&gt;
    &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Prediction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;feedback&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;feedback_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Here are the different prompts I tested:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Vanilla&lt;/strong&gt;: Zero shot prompt using the DSPy format.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Bootstrap few shot golden only&lt;/strong&gt;: Using the golden dataset I previously collected. This is what was running in production for Taralli. &lt;a href="https://dspy.ai/api/optimizers/BootstrapFewShotWithRandomSearch/"&gt;BootstrapFewShotWithRandomSearch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Bootstrap few mixed&lt;/strong&gt;: Uses a dataset that is a mix of my data and the NutriBench data. &lt;a href="https://dspy.ai/api/optimizers/BootstrapFewShotWithRandomSearch/"&gt;BootstrapFewShotWithRandomSearch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;MIPROv2&lt;/strong&gt;: Multiprompt Instruction PRoposal Optimizer Version 2, a prompt optimizer capable of optimizing both the instructions (e.g., the prompt) and the few shot examples. &lt;a href="https://dspy.ai/api/optimizers/MIPROv2/"&gt;MIPROv2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GEPA&lt;/strong&gt;: The new kid on the block for prompt optimizers. The nice thing about GEPA is that you can also give it some textual feedback on the incorrect predictions, and it will use it. &lt;a href="https://dspy.ai/api/optimizers/GEPA/overview/"&gt;GEPA&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As for models, I chose some that were cheap and fast - some closed (Gemini) and others open (DeepSeek). I decided to test &lt;code&gt;Gemini 2.5 Flash&lt;/code&gt; (what I am currently using in production), &lt;code&gt;DeepSeek v3.2&lt;/code&gt; with thinking on and off, and Google's new &lt;code&gt;Gemini 3 Flash&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Some thoughts on the results:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;best performing model&lt;/strong&gt; is &lt;code&gt;Gemini 3 Flash&lt;/code&gt; with a set of 16 examples in the prompt. It achieves a score of around 60%. That is similar to NutriBench's best result, although the problems are slightly different. &lt;a href="https://gist.github.com/duarteocarmo/99df3327c918daa5b68c97572a3e0ee6#file-example-prediction-best-performing-prompt"&gt;Here's an example prediction&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;GEPA optimization&lt;/strong&gt; came up &lt;a href="https://gist.github.com/duarteocarmo/99df3327c918daa5b68c97572a3e0ee6#file-gepa-example"&gt;with a prompt of its own&lt;/a&gt;. When using that prompt with &lt;code&gt;Gemini 2.5 Flash&lt;/code&gt; it performs respectably well. However, the GEPA prompt fails to produce the correct response format when used with any other model. In other words, the prompt was &lt;strong&gt;overfit&lt;/strong&gt; to the model it was trained on. As a result, you only see one GEPA score in the results graph.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;few-shot approach&lt;/strong&gt; is the most reliable one. It is model agnostic, performs well, and faithfully follows the style of the examples.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So, I decided to update Taralli to use &lt;code&gt;Gemini 3 Flash&lt;/code&gt; with the few-shot approach. This approach is ~15% more accurate when compared to the old version, which was running on &lt;code&gt;Gemini 2.5 Flash&lt;/code&gt;, with the exact same optimizer. In conclusion, &lt;strong&gt;all I changed was a model string&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Here's a snippet taken from the API. Since I'm using &lt;a href="https://openrouter.ai/"&gt;OpenRouter&lt;/a&gt;, I can also define our second-best performing model as a backup, like so:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;LM&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s2"&gt;"openrouter/google/gemini-3-flash-preview:nitro"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;extra_body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;"reasoning"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="s2"&gt;"models"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"deepseek/deepseek-v3.2:nitro"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="running-on-the-edge"&gt;Running on the edge&lt;/h2&gt;
&lt;p&gt;&lt;/p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/91/taralli_screenshots.png" alt="Taralli screenshots" style="max-width: 100%"&gt;&lt;p&gt;&lt;/p&gt;
&lt;figcaption&gt;Taralli's new features showcased&lt;/figcaption&gt;
&lt;p&gt;&lt;/p&gt;&lt;/center&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;I've updated Taralli to use Apple's new Liquid Glass design for iOS 26. That was pretty simple. It's a 5-file SwiftUI app of around 4.5 MB. You can now also set periodic reminders so that you don't forget to track.&lt;/p&gt;
&lt;p&gt;One of the best new features is the ability to use Taralli &lt;em&gt;completely offline&lt;/em&gt; with an on-device LLM for nutritional analysis. The code snippet below transforms a DSPy optimized program into OpenAI-compatible messages. I created an endpoint that takes a &lt;code&gt;food_description&lt;/code&gt; and returns these messages for on-device inference. The iOS app calls this endpoint, receives the populated template, and uses it with the on-device model. As a backup, we can bundle the template directly in the app to skip the API call entirely. Long story short, Taralli now works on airplane (mode).&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nd"&gt;@lru_cache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;maxsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_classifier_template&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
    &lt;span class="n"&gt;program&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_classifier&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;adapter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ChatAdapter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;openai_messages_format&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;adapter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;signature&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# type: ignore&lt;/span&gt;
            &lt;span class="n"&gt;demos&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;demos&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;{{&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;}}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;signature&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;input_fields&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;# type: ignore&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;program&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;named_predictors&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}[&lt;/span&gt;&lt;span class="s2"&gt;"self"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;openai_messages_format&lt;/span&gt;
    &lt;span class="c1"&gt;# returns [{"role": "system", "content": "..."}, {"role": "user", "content": "..."}, ...]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
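&lt;p&gt;On the client side, the only remaining step is swapping the &lt;code&gt;{food_description}&lt;/code&gt; placeholder for the real text. Here's a minimal sketch of that step, written in Python to match the snippet above; the app would do the equivalent in Swift. The &lt;code&gt;fill_template&lt;/code&gt; helper and the sample messages are made up for illustration and are not taken from Taralli's code:&lt;/p&gt;

```python
def fill_template(messages, **inputs):
    """Return a copy of the messages with each {name} placeholder filled in."""
    filled = []
    for message in messages:
        content = message["content"]
        for name, value in inputs.items():
            # Replace only the exact "{name}" token and leave other braces alone.
            content = content.replace("{" + name + "}", str(value))
        filled.append({"role": message["role"], "content": content})
    return filled


# Hypothetical template, shaped like the endpoint's output.
template = [
    {"role": "system", "content": "Estimate the macros. Reply as JSON, e.g. {\"calories\": 0}."},
    {"role": "user", "content": "Food: {food_description}"},
]

for message in fill_template(template, food_description="two eggs and a slice of toast"):
    print(message["role"], "|", message["content"])
```

&lt;p&gt;Plain string replacement is probably safer than &lt;code&gt;str.format&lt;/code&gt; here, since an optimized prompt can contain other curly braces (JSON examples, for instance) that &lt;code&gt;str.format&lt;/code&gt; would choke on.&lt;/p&gt;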

&lt;h2 id="final-summary"&gt;Final summary&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Hill climbing NutriBench:&lt;/strong&gt; The best-performing model/prompt combo in my experiments hit 60% Accuracy @ 10%. The task is slightly different from NutriBench - we also predict fat, protein, and carbs, for example. Still, I'm interested in exploring how we could hill climb that dataset. Could we get to 90%? Should we try a larger model? More thinking? Web browsing? How much of a difference would it make?&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Training a model&lt;/strong&gt;: We could obviously also train a model on this task. Prompt optimization will only get you so far, so it would be very interesting to see how well a fine-tuned model would perform. To be completely honest, I still believe that in order to get the best performance possible, we would probably need some sort of connection to the outside world (web search, RAG, or the like).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Few shot wins:&lt;/strong&gt; From several experiments with Taralli, and a couple of other projects, something has become more and more obvious: few-shot often wins. Not only for accuracy, but also for pinning down the output format you are looking for. If you are dealing with a complicated task - try squeezing in some good examples of model behaviour.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Vibes are useless:&lt;/strong&gt; You can't improve something you can't measure. Whenever someone tells me "the vibes are off," I call bullshit. Off? By how much? Can we put a number to it? Whenever you are improving a system, try to be scientific about it. Measure first, understand how the current system is performing, try to improve it, and then measure again.&lt;/li&gt;
&lt;/ul&gt;&lt;ol class="simple-footnotes"&gt;&lt;li id="sf-from-nutribench-to-taralli-how-far-can-you-take-a-prompt-1"&gt;For reference, this is on par with a Human nutritionist with internet access &lt;a href="#sf-from-nutribench-to-taralli-how-far-can-you-take-a-prompt-1-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;li id="sf-from-nutribench-to-taralli-how-far-can-you-take-a-prompt-2"&gt;I could have used more, but wanted to keep things fast &lt;a href="#sf-from-nutribench-to-taralli-how-far-can-you-take-a-prompt-2-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;</content><category term="blog"/></entry><entry><title>Retrospectiva #2</title><link href="https://duarteocarmo.com/blog/retrospectiva-2.html" rel="alternate"/><published>2025-11-25T00:00:00+01:00</published><updated>2025-11-25T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2025-11-25:/blog/retrospectiva-2.html</id><summary type="html">&lt;p&gt;Big news in the state of Denmark. And no, &lt;a href="https://www.shakespeare-online.com/quickquotes/quickquotehamletdenmark.html"&gt;nothing's rotten&lt;/a&gt;. Allegra just came into the world.&lt;/p&gt;
&lt;p&gt;As she takes a nap, I take the opportunity to write November's Retrospectiva update.&lt;/p&gt;
&lt;p&gt;The most relevant thing this month is probably the release of my &lt;a href="https://duarteocarmo.com/blog/book-release-deepseek-in-practice"&gt;recent book about DeepSeek&lt;/a&gt;. It's nice to …&lt;/p&gt;</summary><content type="html">&lt;p&gt;Big news in the state of Denmark. And no, &lt;a href="https://www.shakespeare-online.com/quickquotes/quickquotehamletdenmark.html"&gt;nothing's rotten&lt;/a&gt;. Allegra just came into the world.&lt;/p&gt;
&lt;p&gt;As she takes a nap, I take the opportunity to write November's Retrospectiva update.&lt;/p&gt;
&lt;p&gt;The most relevant thing this month is probably the release of my &lt;a href="https://duarteocarmo.com/blog/book-release-deepseek-in-practice"&gt;recent book about DeepSeek&lt;/a&gt;. It's nice to see it finally come to life. I've also become more regular in my running, hitting a nice weekly volume more consistently, although it does get harder in cold weather. Given the recent events, I'm not sure it will last. In the midst of Vitamin D tablets and diapers, we'll do, as always, the best we can.&lt;/p&gt;
&lt;h2 id="using"&gt;Using&lt;/h2&gt;
&lt;p&gt;In preparation for the baby arriving, I hunted for a pair of headphones I could wear during calls. It needed to do two things: have a microphone that blocks out any surrounding noises (you know), but at the same time allow me to be aware of my surroundings. I regularly wear the &lt;a href="https://shokz.com/pages/openrun"&gt;Open Run&lt;/a&gt; for running. I've enjoyed the bone conduction when I need to be aware of what's around me. I decided to take the plunge and get the &lt;a href="https://shokz.com/products/openmeet"&gt;OpenMeet&lt;/a&gt;'s. After a month of use, I definitely recommend them. They are niche though: they do the &lt;em&gt;exact opposite&lt;/em&gt; of cancelling the noise around you - so be aware of that.&lt;/p&gt;
&lt;p&gt;For years, I've been a fan of &lt;a href="https://jupyter.org/"&gt;Jupyter Notebooks&lt;/a&gt;. If it's more than a script, and I need to know what's going on with the data, notebooks are the way I go. After reading so much about it, I decided to finally try &lt;a href="https://marimo.io/"&gt;Marimo&lt;/a&gt; (&lt;a href="https://www.coreweave.com/news/coreweave-acquires-marimo-to-unify-the-generative-ai-developer-workflow"&gt;just acquired by Coreweave&lt;/a&gt;). The visuals and &lt;a href="https://docs.marimo.io/examples/"&gt;components&lt;/a&gt; are nice and extensive. The fact that any notebook is &lt;em&gt;just&lt;/em&gt; a uv script makes things &lt;em&gt;very&lt;/em&gt; reproducible. You can also transform notebooks into dashboards &lt;a href="https://docs.marimo.io/guides/deploying/programmatically/"&gt;and embed&lt;/a&gt; them into your FastAPI app which is great. But most of my notebooks don't need an AI panel, MCP connectors, interactive cells, and other bells and whistles. They need to show cell outputs at the bottom, a running order I can understand, Copilot, and VIM bindings. That's it.&lt;/p&gt;
&lt;p&gt;So for now, my go-to is still:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;uv&lt;span class="w"&gt; &lt;/span&gt;run&lt;span class="w"&gt; &lt;/span&gt;--with&lt;span class="w"&gt; &lt;/span&gt;jupyter&lt;span class="w"&gt; &lt;/span&gt;--with&lt;span class="w"&gt; &lt;/span&gt;jupyter_copilot&lt;span class="w"&gt; &lt;/span&gt;--with&lt;span class="w"&gt; &lt;/span&gt;jupyterlab-vim&lt;span class="w"&gt; &lt;/span&gt;jupyter&lt;span class="w"&gt; &lt;/span&gt;lab
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;On a final note, I completely got rid of &lt;a href="https://www.dropbox.com"&gt;Dropbox&lt;/a&gt;. For years, it was the tool of choice for my &lt;em&gt;important&lt;/em&gt; documents. They've been asking me to upgrade to a paid plan for years now to go beyond my 3GB limit(!). I've replaced it with self-hosted &lt;a href="https://nextcloud.com/"&gt;Nextcloud&lt;/a&gt; on my &lt;a href="https://duarteocarmo.com/blog/how-i-self-host-in-2024"&gt;Coolify&lt;/a&gt; instance. Now I have unlimited space, a lighter Mac client, and exactly the same features. Without spending a single cent more.&lt;/p&gt;
&lt;h2 id="reading"&gt;Reading&lt;/h2&gt;
&lt;p&gt;I finally took the plunge and went back to reading some good old science fiction (the last thing I got seriously into was the &lt;a href="https://www.goodreads.com/series/117100-red-rising-saga"&gt;Red Rising&lt;/a&gt; series). I've been a John Scalzi fan for a long time, so I decided to pick up &lt;a href="https://www.goodreads.com/book/show/36510196-old-man-s-war"&gt;Old Man's War&lt;/a&gt;. I like Scalzi's straight, direct, and &lt;em&gt;no-bs&lt;/em&gt; writing style. Great for when I don't have the lights on and have to rely on my &lt;a href="https://duarteocarmo.com/blog/goodbye-kindle-i-dont-think-ill-miss-you"&gt;Palma&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I'm also diving deep into LLM architectures and pre-training this month. &lt;a href="https://huggingface.co/spaces/HuggingFaceFW/blogpost-fineweb-v1"&gt;&lt;em&gt;"FineWeb: decanting the web for the finest text data at scale"&lt;/em&gt;&lt;/a&gt;, was an insightful view into the world of creating datasets for pre-training. Full of gems. One of my favorites was the in-depth explanation of the &lt;a href="https://huggingface.co/spaces/HuggingFaceFW/blogpost-fineweb-v1"&gt;FineWeb-Edu classifier&lt;/a&gt;. The idea is simple: to train a classifier we can use to filter pre-training data in order to get the most "refined" pretraining dataset possible.&lt;/p&gt;
&lt;p&gt;Highly related to that, I ordered &lt;em&gt;&lt;a href="https://www.lulu.com/shop/nouamane-tazi-and-ferdinand-mom-and-haojun-zhao-and-phuc-nguyen/the-ultra-scale-playbook/paperback/product-45yk9dj.html?page=1&amp;amp;pageSize=4"&gt;"The Ultra-Scale Playbook: Training LLMs on GPU Clusters"&lt;/a&gt;&lt;/em&gt;. It’s the second book in this "trilogy" of blog posts (3rd one still being written). It starts off with some basic concepts like Data, Tensor, and Context Parallelism, but goes off to more advanced things like training configurations and GPU kernels. You can really feel the "war stories" some of the authors went through before writing it.&lt;/p&gt;
&lt;h2 id="listening"&gt;Listening&lt;/h2&gt;
&lt;p&gt;I gave Apple Music a fair shot, but I had to go back to Spotify. Music is too important to have a search functionality that doesn't work. I don't want a single barrier between me and what I want to listen to.&lt;/p&gt;
&lt;p&gt;There are a lot of things wrong with Apple Music. But one good thing is the live radio and shows hosted on there. Particularly &lt;a href="https://music.apple.com/us/curator/classical-connections-radio-with-alexis-ffrench/1653477785"&gt;Classical Connections Radio&lt;/a&gt; which blends modern and classical music in a very original way. That show led me to the deep rabbit hole of &lt;a href="https://en.wikipedia.org/wiki/Duduk"&gt;Duduk&lt;/a&gt; based music, which is particularly useful for soothing a sleeping baby.&lt;/p&gt;
&lt;p&gt;Here's an amazing one by &lt;a href="https://en.wikipedia.org/wiki/Djivan_Gasparyan"&gt;Djivan Gasparyan&lt;/a&gt;:&lt;/p&gt;
&lt;iframe data-testid="embed-iframe" style="border-radius:12px" src="https://open.spotify.com/embed/track/74EHZDVuASkFGH5BSy9KPp?utm_source=generator&amp;theme=0" width="100%" height="152" frameBorder="0" allowfullscreen="" allow="autoplay; clipboard-write; encrypted-media; fullscreen; picture-in-picture" loading="lazy"&gt;&lt;/iframe&gt;

&lt;h2 id="watching"&gt;Watching&lt;/h2&gt;
&lt;p&gt;We found some time to pull the old projector out. Lately, the focus has been Spanish thrillers (or as Italians call the genre: &lt;em&gt;giallo&lt;/em&gt; - i.e., yellow). The Netflix Spain originals based on &lt;a href="https://www.goodreads.com/author/show/8339062.Javier_Castillo"&gt;Javier Castillo's&lt;/a&gt; writing have been the binge target. First up was the &lt;a href="https://www.rottentomatoes.com/tv/the_crystal_cuckoo"&gt;Crystal Cuckoo&lt;/a&gt;, which was definitely entertaining. That sparked our interest, so we started &lt;a href="https://www.rottentomatoes.com/tv/the_snow_girl"&gt;"Chica de Nieve"&lt;/a&gt; (i.e., the Snow Girl), which has been even better so far.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;&lt;/p&gt;
&lt;iframe width="100%" height="315" src="https://www.youtube.com/embed/SPcE8J2Xwjc?si=c54oLGgCNVrSveDq&amp;amp;controls=0" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen&gt;&lt;/iframe&gt;
&lt;p&gt;&lt;/center&gt;&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Book release: DeepSeek in Practice</title><link href="https://duarteocarmo.com/blog/book-release-deepseek-in-practice.html" rel="alternate"/><published>2025-11-17T00:00:00+01:00</published><updated>2025-11-17T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2025-11-17:/blog/book-release-deepseek-in-practice.html</id><summary type="html">&lt;p&gt;&lt;br&gt;
&lt;a href="https://www.packtpub.com/en-us/product/deepseek-in-practice-9781806020850" target="_blank"&gt;
&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/89/cover_image.webp" alt="DeepSeek in Practice book Packt" class="shadow" style="max-width: 40%"&gt;
&lt;/center&gt;
&lt;/a&gt;
&lt;br&gt;
Back in May this year, my longtime friend &lt;a href="https://mlops.systems/about.html"&gt;Alex&lt;/a&gt; reached out and asked me if I wanted to collaborate with him on a book about DeepSeek. I would love to tell you the story of how I thought long and hard before getting back to him. I didn't. I just …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;br&gt;
&lt;a href="https://www.packtpub.com/en-us/product/deepseek-in-practice-9781806020850" target="_blank"&gt;
&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/89/cover_image.webp" alt="DeepSeek in Practice book Packt" class="shadow" style="max-width: 40%"&gt;
&lt;/center&gt;
&lt;/a&gt;
&lt;br&gt;
Back in May this year, my longtime friend &lt;a href="https://mlops.systems/about.html"&gt;Alex&lt;/a&gt; reached out and asked me if I wanted to collaborate with him on a book about DeepSeek. I would love to tell you the story of how I thought long and hard before getting back to him. I didn't. I just said yes.&lt;/p&gt;
&lt;p&gt;Now, 6 months later, I'm excited to announce that &lt;a href="https://www.packtpub.com/en-us/product/deepseek-in-practice-9781806020850"&gt;&lt;em&gt;DeepSeek in Practice&lt;/em&gt;&lt;/a&gt; is out and available for pre-order! The book is published by &lt;a href="https://en.wikipedia.org/wiki/Packt"&gt;Packt&lt;/a&gt; and is a collaboration between &lt;a href="https://pengandy.com/"&gt;Andy Peng&lt;/a&gt;, &lt;a href="https://mlops.systems/about.html"&gt;Alex Strick van Linschoten&lt;/a&gt;, and me.&lt;/p&gt;
&lt;p&gt;As the title suggests, this book focuses on DeepSeek models, which &lt;a href="https://www.nytimes.com/2025/01/27/technology/what-is-deepseek-china-ai.html"&gt;famously took the world by storm earlier this year&lt;/a&gt;. It's organized into three parts: "Understanding and Exploring DeepSeek" examines their role in the wider LLM world. "Using DeepSeek" is entirely dedicated to applying DeepSeek models to real-world problems. Finally, "Distilling and Deploying DeepSeek," covers distillation and deployment.&lt;/p&gt;
&lt;p&gt;Even though the book was a joint effort, my main focus was on Part 2, and how to use DeepSeek models to tackle real-world problems. In particular two chapters:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Chapter 5, &lt;em&gt;Building with DeepSeek&lt;/em&gt;,&lt;/strong&gt; walks through creating an alternative to Garmin's daily summary notifications. We go from building a prototype with DeepSeek's API, through leveraging local models and frameworks like &lt;a href="https://github.com/mlc-ai/xgrammar"&gt;XGrammar&lt;/a&gt; and &lt;a href="https://docs.vllm.ai/en/latest/"&gt;vLLM&lt;/a&gt;, all the way to deploying your own model using &lt;a href="https://docs.djl.ai/master/docs/serving/serving/docs/lmi/index.html"&gt;AWS Large Model Inference&lt;/a&gt; (LMI).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Chapter 6, &lt;em&gt;Agents with DeepSeek&lt;/em&gt;,&lt;/strong&gt; is all about agents. We start with a short intro to agents, tools, and the inner workings of the &lt;a href="https://modelcontextprotocol.io/docs/getting-started/intro"&gt;MCP protocol&lt;/a&gt;. After that, we build (from scratch) three different agents powered by DeepSeek models: an evaluator-optimizer that summarizes Arxiv papers, an orchestrator-worker that generates research reports, and a tool-calling agent that can search the web and answer complex questions.&lt;/p&gt;
&lt;p&gt;Overall, I'm really excited that this book finally gets to see the light of day. Contributing to a book is no small feat. The process is rough: lots of discussion and lots of drafts thrown in the trash. It was hard work, but also a lot of fun.&lt;/p&gt;
&lt;p&gt;I hope you enjoy it!&lt;/p&gt;
&lt;p&gt;Links:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.packtpub.com/en-us/product/deepseek-in-practice-9781806020850"&gt;Packt&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://a.co/d/8I139tG"&gt;Amazon&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/PacktPublishing/DeepSeek-in-Practice"&gt;GitHub Repository&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content><category term="blog"/></entry><entry><title>TTS still sucks</title><link href="https://duarteocarmo.com/blog/tts-still-sucks.html" rel="alternate"/><published>2025-11-10T00:00:00+01:00</published><updated>2025-11-10T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2025-11-10:/blog/tts-still-sucks.html</id><summary type="html">&lt;p&gt;or at least the open versions of it. I have this very stupid rule. A couple of years ago I decided to &lt;a href="https://duarteocarmo.com/blog/you-can-now-listen-to-this-blog"&gt;turn this blog into a podcast&lt;/a&gt;. At the time, I decided to make up a stupid rule: whatever model I use to clone my voice and generate article …&lt;/p&gt;</summary><content type="html">&lt;p&gt;or at least the open versions of it. I have this very stupid rule. A couple of years ago I decided to &lt;a href="https://duarteocarmo.com/blog/you-can-now-listen-to-this-blog"&gt;turn this blog into a podcast&lt;/a&gt;. At the time, I decided to make up a stupid rule: whatever model I use to clone my voice and generate article transcripts needs to be an open model.&lt;/p&gt;
&lt;p&gt;Why? Because - as you might have figured by now - I like to make my life hard. The last version of the podcast generation engine was running on &lt;a href="https://arxiv.org/abs/2410.06885"&gt;F5-TTS&lt;/a&gt;. It was &lt;em&gt;fine&lt;/em&gt;. I still got some funny messages from people showing me the model completely hallucinating or squeaking here and there. But a year later - I was pretty sure there would be something incredibly better out there.&lt;/p&gt;
&lt;p&gt;Now I’m not so sure.&lt;/p&gt;
&lt;p&gt;The first step was to look for the best TTS models out there. Thankfully, Artificial Analysis now publishes a leaderboard with the “best” &lt;a href="https://artificialanalysis.ai/text-to-speech/leaderboard"&gt;text-to-speech models&lt;/a&gt;. After filtering by my stupid rule of open models, we get the below ranking.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/88/tts-rankings.png" alt="tts rankings"&gt;&lt;/p&gt;
&lt;p&gt;At the top of the leaderboard is &lt;a href="https://huggingface.co/hexgrad/Kokoro-82M"&gt;Kokoro&lt;/a&gt;. Kokoro is an amazing model! Especially for a modest 82 Million (!) parameters and a mere 360 MB (!). However, like many models in this leaderboard - I can’t use it - since it doesn’t support voice cloning.&lt;/p&gt;
&lt;p&gt;I started by looking at some of the stuff from &lt;a href="https://fish.audio/"&gt;Fish Audio&lt;/a&gt;. Their &lt;a href="https://github.com/fishaudio/fish-speech"&gt;codebase&lt;/a&gt; seems to now support their new &lt;a href="https://huggingface.co/fishaudio/openaudio-s1-mini"&gt;S1-mini model&lt;/a&gt;. When testing it, most of the &lt;a href="https://github.com/fishaudio/fish-speech?tab=readme-ov-file#speech-control"&gt;emotion markers&lt;/a&gt; did not work - or were only available in their closed version. The breaks and long pauses did not work either. Also, the &lt;a href="https://github.com/search?q=repo%3Afishaudio%2Ffish-speech%20chunk&amp;amp;type=code"&gt;chunking&lt;/a&gt; parameter is completely unused throughout the codebase - so not sure why it’s there. It’s a common business model nowadays: announce a state-of-the-art open model just to attract attention to your real, incredibly powerful gated model that people have to pay for.&lt;/p&gt;
&lt;p&gt;My second-best option on the list was &lt;a href="https://github.com/resemble-ai/chatterbox"&gt;Chatterbox&lt;/a&gt;. This wave of TTS models comes with major limitations. They're all restricted to short character counts - around 1,000–2,000 characters, sometimes even less. Ask them to generate anything longer, and the voice starts hallucinating or speeds up uncontrollably.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://huggingface.co/coqui/XTTS-v2"&gt;XTTS-v2&lt;/a&gt;&lt;/strong&gt;
&lt;audio controls style="width: 75%; display: block;" preload="metadata"&gt;&lt;source src="https://r2.duarteocarmo.com/old/xtts_v2.mp3" type="audio/mpeg"&gt;&lt;/audio&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/SWivid/F5-TTS"&gt;F5-TTS&lt;/a&gt;&lt;/strong&gt;
&lt;audio controls style="width: 75%; display: block" preload="metadata"&gt;&lt;source src="https://r2.duarteocarmo.com/old/f5_tts.mp3" type="audio/mpeg"&gt;&lt;/audio&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/resemble-ai/chatterbox"&gt;Chatterbox&lt;/a&gt;&lt;/strong&gt; (latest version)
&lt;audio controls style="width: 75%; display: block" preload="metadata"&gt;&lt;source src="https://r2.duarteocarmo.com/old/chatterbox.mp3" type="audio/mpeg"&gt;&lt;/audio&gt;&lt;/p&gt;
&lt;p&gt;The transcript generation process is straightforward. First, text gets extracted from my RSS feed and &lt;a href="https://www.notion.so/TTS-still-sucks-29caecef3e2c80bc81ddc9484cab8c59?pvs=21"&gt;pre-processed by an LLM&lt;/a&gt; to make it more "readable". The LLM generates a transcript, a short summary, and a list of links for the show notes. We then chunk the transcript and &lt;a href="https://github.com/duarteocarmo/podcaster/blob/138d64ad083c2e4355e5f92980762b2d9af7c133/src/podcaster/transcription.py#L52"&gt;fire that off to a bunch of parallel Modal containers&lt;/a&gt; where we run the Chatterbox TTS model. Once we get everything back, we stitch the wav files together, and &lt;em&gt;voilà&lt;/em&gt;! The episode is ready. The hosting is an S3 bucket. Really, that’s what you are paying your podcast host for - it’s a lucrative business!&lt;/p&gt;
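&lt;p&gt;The stitching step is less magical than it sounds. Here's a rough sketch using only Python's standard &lt;code&gt;wave&lt;/code&gt; module, assuming every chunk comes back as WAV bytes with the same sample rate, sample width, and channel count. The &lt;code&gt;stitch_wavs&lt;/code&gt; name is made up for this example, and the linked repo may well do it differently:&lt;/p&gt;

```python
import io
import wave


def stitch_wavs(wav_blobs):
    """Concatenate WAV byte strings that share one audio format."""
    if not wav_blobs:
        raise ValueError("need at least one WAV chunk")
    output = io.BytesIO()
    with wave.open(output, "wb") as writer:
        audio_format = None
        for blob in wav_blobs:
            with wave.open(io.BytesIO(blob), "rb") as reader:
                # (channels, sample width, frame rate) must match across chunks.
                chunk_format = reader.getparams()[:3]
                if audio_format is None:
                    audio_format = chunk_format
                    writer.setparams(reader.getparams())
                elif chunk_format != audio_format:
                    raise ValueError("chunks use different audio formats")
                writer.writeframes(reader.readframes(reader.getnframes()))
    # Closing the writer patches the header with the final frame count.
    return output.getvalue()
```

&lt;p&gt;Since the chunks are generated in parallel, they need to be passed in transcript order, not in the order the containers finish.&lt;/p&gt;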
&lt;iframe allow="autoplay *; encrypted-media *; fullscreen *; clipboard-write" frameborder="0" height="450" style="width:100%;max-width:660px;overflow:hidden;border-radius:10px;" sandbox="allow-forms allow-popups allow-same-origin allow-scripts allow-storage-access-by-user-activation allow-top-navigation-by-user-activation" src="https://embed.podcasts.apple.com/us/podcast/duarte-o-carmos-articles/id1719493997"&gt;&lt;/iframe&gt;

&lt;p&gt;I also made some improvements to the podcast generation side of things. First of all, the podcast is now also available on &lt;a href="https://open.spotify.com/show/0qIVAs1ZDWnpJOQeMo1OjY?si=995d03c12ce749bd"&gt;Spotify&lt;/a&gt;. Additionally, I fixed the show notes to now include nice clickable links in almost every podcast player. Looking at you Apple and your &lt;code&gt;CDATA&lt;/code&gt; &lt;a href="https://help.apple.com/itc/podcasts_connect/#/itcb54353390"&gt;requirements&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;Some thoughts on the Chatterbox model. It’s definitely better than F5-TTS. There are, however, some common annoyances with almost every open-source voice cloning model. The first is the limited duration of the generated speech. Anything over 1000 characters starts hallucinating. The second is lack of &lt;em&gt;control&lt;/em&gt;. Some models have &lt;a href="https://github.com/fishaudio/fish-speech?tab=readme-ov-file#speech-control"&gt;emotion tags&lt;/a&gt;, others have &lt;code&gt;&amp;lt;pause&amp;gt;&lt;/code&gt; indicators. But almost every single one of these has been massively unreliable. To the point where I am splitting my text into one sentence per line and shipping that off to the TTS model to make things a bit more reliable.&lt;/p&gt;
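&lt;p&gt;For the curious, the sentence-per-line trick can be sketched with nothing but the standard library. This is a naive illustration, not the code from the podcaster repo: &lt;code&gt;split_into_lines&lt;/code&gt; is a hypothetical name, and the regex will happily break on abbreviations like "e.g." or decimals like "1.5":&lt;/p&gt;

```python
import re

# A sentence here is a run of text followed by its closing punctuation.
SENTENCE_PATTERN = re.compile(r"[^.!?]+[.!?]*")


def split_into_lines(text):
    """Return the text with one sentence per line (naive splitter)."""
    # Collapse newlines and repeated spaces so each sentence stays on one line.
    normalized = " ".join(text.split())
    sentences = [match.strip() for match in SENTENCE_PATTERN.findall(normalized)]
    return "\n".join(sentence for sentence in sentences if sentence)


print(split_into_lines("TTS has come a long way. Has it though? Not quite!"))
```

&lt;p&gt;Each line can then be sent to the TTS model on its own, which keeps every request far below the ~1000 character mark.&lt;/p&gt;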
&lt;p&gt;So yes, from one side, TTS has come a long way. But when compared to &lt;a href="https://elevenlabs.io/voice-cloning"&gt;other&lt;/a&gt; &lt;a href="https://www.minimax.io/audio"&gt;proprietary&lt;/a&gt; &lt;a href="https://inworld.ai/tts"&gt;systems&lt;/a&gt;, TTS still sucks.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: The rss to podcast pipeline is open source and available if you want to re-use it in &lt;a href="https://github.com/duarteocarmo/podcaster"&gt;this GitHub repo&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Retrospectiva #1</title><link href="https://duarteocarmo.com/blog/retrospectiva-1.html" rel="alternate"/><published>2025-10-31T00:00:00+01:00</published><updated>2025-10-31T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2025-10-31:/blog/retrospectiva-1.html</id><summary type="html">&lt;p&gt;Welcome to &lt;em&gt;Retrospectiva.&lt;/em&gt; Retrospectiva is a monthly update about what I’ve been up to. In the age of LLMs, I’ve heard many argue that it’s hard (and useless) to write anything at all anymore. When anyone can prompt a model and get some text from the magic …&lt;/p&gt;</summary><content type="html">&lt;p&gt;Welcome to &lt;em&gt;Retrospectiva.&lt;/em&gt; Retrospectiva is a monthly update about what I’ve been up to. In the age of LLMs, I’ve heard many argue that it’s hard (and useless) to write anything at all anymore. When anyone can prompt a model and get some text from the magic box, what’s the point?&lt;/p&gt;
&lt;p&gt;I couldn’t disagree more. Creativity, personality, and &lt;em&gt;good taste&lt;/em&gt; have never been so important. If anything, I’m planning to write more.&lt;/p&gt;
&lt;p&gt;Winter has arrived in Denmark, and the sun is setting at around 4PM. It’s what we call the &lt;a href="https://www.oed.com/dictionary/hygge_n"&gt;hygge&lt;/a&gt; period. It’s dark, cold, and rainy most of the time - and so we make the most out of staying inside.&lt;/p&gt;
&lt;h2 id="using"&gt;Using&lt;/h2&gt;
&lt;p&gt;The endless chase for the perfect browser continues. I was a long-time Firefox user until things started breaking and I moved to Chrome. I hate Chrome. Don’t get me wrong - Chrome works 99% of the time - but I like taking Google products with a grain of salt.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://helium.computer/"&gt;Helium&lt;/a&gt; has been a breath of fresh air. The guts are the same as Chrome, but it’s de-googlified and completely &lt;a href="https://github.com/imputnet/helium"&gt;open-source&lt;/a&gt;. One Chrome feature I rely on is the built-in webpage translation. Chrome’s Google Translate integration does this better than anything else. But &lt;a href="https://github.com/FilipePS/Traduzir-paginas-web"&gt;this little alternative&lt;/a&gt; has worked great for me.&lt;/p&gt;
&lt;p&gt;It’s a fruitful period for tech! In the AI-tool space there’s &lt;a href="https://www.claude.com/product/claude-code"&gt;Claude Code&lt;/a&gt;, &lt;a href="https://developers.openai.com/codex/cli/"&gt;Codex&lt;/a&gt;, &lt;a href="https://cursor.com/blog/2-0"&gt;Cursor&lt;/a&gt;, &lt;a href="https://cline.bot/"&gt;Cline&lt;/a&gt;, &lt;a href="https://github.com/features/copilot/cli"&gt;Copilot&lt;/a&gt;, and probably 54 others. We are the ones that get to benefit from every company fighting to get us to use their tools.&lt;/p&gt;
&lt;p&gt;When I’m asked about what I’m using, my answer has consistently been: “It changes every week”. Recently I’ve gone back to what I love: &lt;a href="https://neovim.io/"&gt;Neovim&lt;/a&gt;, &lt;a href="https://github.com/tmux/tmux/wiki"&gt;tmux&lt;/a&gt;, and a couple of CLI based agents. To pull everything together &lt;a href="https://github.com/folke/sidekick.nvim"&gt;Sidekick&lt;/a&gt; has been great.&lt;/p&gt;
&lt;h2 id="reading"&gt;Reading&lt;/h2&gt;
&lt;p&gt;For most of 2025, I’ve switched between &lt;a href="https://duarteocarmo.com/blog/goodbye-kindle-i-dont-think-ill-miss-you"&gt;digital&lt;/a&gt; and physical books. But living in a small Danish apartment means I don’t have a bedside light when reading at night. Vitto got me this little &lt;a href="https://www.igritin.com/products/gritin-9-led-rechargeable-book-light-for-reading-in-bed"&gt;20 USD light&lt;/a&gt; and I’ve been back on the physical book train.&lt;/p&gt;
&lt;p&gt;Two highlights. The first is &lt;a href="https://www.lulu.com/shop/sean-goedecke/software-engineering-after-the-vibe-shift/paperback/product-jew4ver.html?q=Sean+Goedecke&amp;amp;page=1&amp;amp;pageSize=4"&gt;Software Engineering after the Vibe Shift&lt;/a&gt; by Sean Goedecke. It’s a short but insightful book. Even though I don’t consider myself a Software Engineer (I’m more on the Machine Learning side of things), a considerable number of ideas hit home:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;«&lt;em&gt;The best way to anticipate problems is to deploy early. In general, a helpful question to ask is &lt;strong&gt;can I ship this right now?&lt;/strong&gt; Not this week, not today: right this second. If not, what would have to change for me to be able to ship something?&lt;/em&gt;»&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I’m finishing up &lt;a href="https://www.manning.com/books/machine-learning-system-design"&gt;Machine Learning System Design&lt;/a&gt; by Valerii Babushkin and Arseny Kravchenko. It’s a great read so far, with lots of interesting fireside stories about building machine learning systems at scale. It’s funny how a simple binary classification can get complicated when you are building at scale.&lt;/p&gt;
&lt;p&gt;I need to get back to reading some fiction.&lt;/p&gt;
&lt;h2 id="listening"&gt;Listening&lt;/h2&gt;
&lt;p&gt;Vitto asked me to move off Spotify with her. This is not a political blog - she has her reasons. So I took the plunge. After all, it’s fairly easy to move &lt;a href="https://support.apple.com/en-us/118249"&gt;my playlists around&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Overall: mixed bag. I could get around all the Google Home limitations by running &lt;a href="https://github.com/philippe44/AirConnect"&gt;AirConnect&lt;/a&gt; on my Raspberry Pi, and I understand that Apple Music vs. Spotify is largely about quality vs. quantity. But I’m not sure I’m sold. The Apple Music app on the Mac is buggy, and the playlist diversity is just not there. I want “Chill Italian Rap” - Spotify has at least 45 playlist results - how come Apple has none? And don’t get me started &lt;a href="https://discussions.apple.com/thread/255774369?sortBy=rank"&gt;on this&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For podcasts, I have two recommendations. &lt;a href="https://www.dwarkesh.com/p/andrej-karpathy"&gt;Karpathy's interview with Dwarkesh&lt;/a&gt; is a must-listen. I found myself smiling regularly throughout this one. Another great listen was &lt;a href="https://newsletter.pragmaticengineer.com/p/python-go-rust-typescript-and-ai"&gt;Armin Ronacher’s interview&lt;/a&gt; on the Pragmatic Engineer.&lt;/p&gt;
&lt;h2 id="watching"&gt;Watching&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=Q0TpWitfxPk"&gt;This video from Hank Green about the state of the AI industry&lt;/a&gt; is a very interesting watch (and commentary on &lt;a href="https://archive.vn/V0dWR"&gt;this article&lt;/a&gt;). Sure, he’s funny and entertaining, &lt;a href="https://www.publico.pt/2025/10/08/opiniao/opiniao/ia-bolha-utilidade-2150129"&gt;but I also agree on a lot&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;But not everything is about ML and AI - and since it's winter - we also took some time to get some new blankets and fire up the old projector. HBO's &lt;a href="https://www.rottentomatoes.com/tv/a_body_in_the_snow_the_trial_of_karen_read/s01"&gt;Trial of Karen Read&lt;/a&gt; wasn't the most &lt;em&gt;intellectual&lt;/em&gt; watch but it was certainly entertaining. Recommend that one for the “binge in a day” category.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Faísca: The modern LLM stack in a single script</title><link href="https://duarteocarmo.com/blog/faisca-the-modern-llm-stack-in-a-single-script.html" rel="alternate"/><published>2025-10-15T00:00:00+02:00</published><updated>2025-10-15T00:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2025-10-15:/blog/faisca-the-modern-llm-stack-in-a-single-script.html</id><summary type="html">&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/86/cover.webp" alt="Flower Clouds - Odilon Redon" style="max-width:100%;border-radius: 2px;"&gt;&lt;/p&gt;
&lt;div class="toc"&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#why-do-this"&gt;Why do this?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#a-small-dataset-of-news-headlines"&gt;A small dataset of news headlines&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#gpt2-in-pytorch"&gt;GPT2 in PyTorch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#pre-training-headlines-in-portuguese"&gt;Pre-training headlines in Portuguese&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#supervised-fine-tuning-sft-on-portuguese-from-portugal"&gt;Supervised fine-tuning (SFT) on Portuguese from Portugal&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#reinforcement-learning-grpo-for-sports-news"&gt;Reinforcement Learning (GRPO) for sports news&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#final-thoughts-acknowledgements"&gt;Final thoughts &amp;amp; Acknowledgements&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id="why-do-this"&gt;Why do this?&lt;/h2&gt;
&lt;p&gt;ML and AI are moving at an incredible pace. The amount of research coming out …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/86/cover.webp" alt="Flower Clouds - Odilon Redon" style="max-width:100%;border-radius: 2px;"&gt;&lt;/p&gt;
&lt;div class="toc"&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#why-do-this"&gt;Why do this?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#a-small-dataset-of-news-headlines"&gt;A small dataset of news headlines&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#gpt2-in-pytorch"&gt;GPT2 in PyTorch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#pre-training-headlines-in-portuguese"&gt;Pre-training headlines in Portuguese&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#supervised-fine-tuning-sft-on-portuguese-from-portugal"&gt;Supervised fine-tuning (SFT) on Portuguese from Portugal&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#reinforcement-learning-grpo-for-sports-news"&gt;Reinforcement Learning (GRPO) for sports news&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#final-thoughts-acknowledgements"&gt;Final thoughts &amp;amp; Acknowledgements&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id="why-do-this"&gt;Why do this?&lt;/h2&gt;
&lt;p&gt;ML and AI are moving at an incredible pace. The amount of research coming out vastly surpasses anyone's ability to &lt;em&gt;interiorize&lt;/em&gt; it. By interiorize I mean study, experiment, or even just test it out. However, the importance of learning has never been greater. Even if we don't really know what the future looks like, time studying, reading, and building is &lt;em&gt;never&lt;/em&gt; wasted.&lt;/p&gt;
&lt;p&gt;I've always preferred pragmatic resources over theoretical ones. I learn best by doing. Resources like Andrej's &lt;a href="https://github.com/karpathy/minGPT"&gt;minGPT&lt;/a&gt; and Raschka's &lt;a href="https://github.com/rasbt/LLMs-from-scratch"&gt;LLM from scratch series&lt;/a&gt; are the ones I learn best from. So, I decided to build my own minimal implementation of an LLM that captures most of the concepts around "modern" training. It's called: &lt;a href="https://github.com/duarteocarmo/faisca"&gt;Faísca&lt;/a&gt;&lt;sup id="sf-faisca-the-modern-llm-stack-in-a-single-script-1-back"&gt;&lt;a href="#sf-faisca-the-modern-llm-stack-in-a-single-script-1" class="simple-footnote" title="Means spark in Portuguese"&gt;1&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
&lt;p&gt;Faísca is an implementation of the "modern" LLM stack in a single script of around 1000 lines of code. All you need to start training is &lt;a href="https://docs.astral.sh/uv/getting-started/installation/"&gt;uv&lt;/a&gt;:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;uv&lt;span class="w"&gt; &lt;/span&gt;run&lt;span class="w"&gt; &lt;/span&gt;https://raw.githubusercontent.com/duarteocarmo/faisca/refs/heads/master/faisca_torch.py
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;I built it as an educational resource I can hack on over time. Right now, it trains a model that generates newspaper headlines in Portuguese. Simple? Absolutely. It's by no means comparable to ChatGPT. Faísca has &lt;em&gt;only&lt;/em&gt; 13 million parameters. (For context, GLM 4.6 - the best-performing open-source model in LMArena - has ~355 billion&lt;sup id="sf-faisca-the-modern-llm-stack-in-a-single-script-2-back"&gt;&lt;a href="#sf-faisca-the-modern-llm-stack-in-a-single-script-2" class="simple-footnote" title="With only 32 active but still"&gt;2&lt;/a&gt;&lt;/sup&gt;: Faísca is &lt;strong&gt;~27,000 times smaller&lt;/strong&gt;.)&lt;/p&gt;
&lt;p&gt;If you just want the code, &lt;a href="https://github.com/duarteocarmo/faisca/blob/master/faisca_torch.py"&gt;it's here&lt;/a&gt;. For the rest, let me walk you through how it works.&lt;/p&gt;
&lt;h2 id="a-small-dataset-of-news-headlines"&gt;A small dataset of news headlines&lt;/h2&gt;
&lt;p&gt;Faísca can work with any dataset. For this example, I started with something close to my heart: &lt;a href="/about"&gt;Portuguese&lt;/a&gt;. I built a &lt;a href="https://huggingface.co/datasets/duarteocarmo/ccnews-titles-2016"&gt;dataset&lt;/a&gt; that filters &lt;a href="https://commoncrawl.org/news-crawl"&gt;Common Crawl News&lt;/a&gt; for only 2016. The resulting &lt;a href="https://huggingface.co/datasets/duarteocarmo/ccnews-titles-2016"&gt;dataset&lt;/a&gt; has 1.8 million headlines from publications around the world.&lt;/p&gt;
&lt;p&gt;I also kept the language identified by Common Crawl, as well as the original URL.&lt;/p&gt;
&lt;p&gt;&lt;iframe src="https://huggingface.co/datasets/duarteocarmo/ccnews-titles-2016/embed/viewer/default/train" frameborder="0" width="100%" height="560px"&gt;&lt;/iframe&gt;
&lt;/p&gt;
&lt;h2 id="gpt2-in-pytorch"&gt;GPT2 in PyTorch&lt;/h2&gt;
&lt;p&gt;FaiscaGPT is inspired by both minGPT and LLM-from-scratch. It's a mix that implements the &lt;a href="https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf"&gt;GPT2&lt;/a&gt; architecture using only &lt;a href="https://pytorch.org/"&gt;PyTorch&lt;/a&gt; (the only dependency). At the core, you'll notice the Transformer Block with multi-head attention. In its default configuration, the model has 4 layers, each with 4 attention heads, and an embedding dimension of 128. This means the whole thing is around 13 million parameters (i.e., it should run fine on modern MacBooks!).&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TransformerBlock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;embedding_dimension&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;num_heads&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;qkv_bias&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;dropout_rate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;attention&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MultiheadAttention&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;embed_dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embedding_dimension&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;num_heads&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;num_heads&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dropout_rate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;bias&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;qkv_bias&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;batch_first&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;feed_forward&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;FeedForward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;embedding_dimension&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embedding_dimension&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LayerNorm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding_dimension&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LayerNorm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding_dimension&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drop_shortcut&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dropout_rate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FaiscaGPT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;token_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vocab_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding_dimension&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;positional_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;context_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding_dimension&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropout_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropout_rate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transformer_blocks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="n"&gt;TransformerBlock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;embedding_dimension&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding_dimension&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;num_heads&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num_heads&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;qkv_bias&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;qkv_bias&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;dropout_rate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropout_rate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num_layers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;final_layer_norm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LayerNorm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding_dimension&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;out_head&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding_dimension&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vocab_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bias&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;False&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;n_params_all&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;numel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="n"&gt;n_params_all_million&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;n_params_all&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mf"&gt;1e6&lt;/span&gt;
        &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Total number of params: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;n_params_all_million&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;.2f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;M"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;If this is too big/small for you, you can tweak the settings in the &lt;a href="https://github.com/duarteocarmo/faisca/blob/master/faisca_torch.py#L979"&gt;config&lt;/a&gt;. I wanted something powerful, &lt;em&gt;mostly&lt;/em&gt; bug-free (I hope), and not overwhelming for someone to tweak.&lt;/p&gt;
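&lt;p&gt;To get a feel for how those settings change the model size, here is a rough back-of-the-envelope parameter count. It is only a sketch, not Faísca's actual code: &lt;code&gt;estimate_parameters&lt;/code&gt; is a hypothetical helper, and the vocabulary size (50257, the GPT-2 BPE vocabulary), the context length (256), the 4x feed-forward expansion, and the presence of bias terms are all assumptions on my part.&lt;/p&gt;

```python
def estimate_parameters(
    vocab_size: int,
    context_length: int,
    embedding_dimension: int,
    num_layers: int,
    ff_multiplier: int = 4,  # assumption: hidden layer is 4x the embedding size
) -> int:
    """Rough parameter count for a GPT-2 style model with an untied output head."""
    d = embedding_dimension

    token_embedding = vocab_size * d
    positional_embedding = context_length * d

    # Attention: query, key, value and output projections (d x d each) plus biases.
    attention = 4 * d * d + 4 * d

    # Feed-forward: two linear layers (d to hidden, hidden to d) plus biases.
    hidden = ff_multiplier * d
    feed_forward = 2 * d * hidden + hidden + d

    # Two LayerNorms per block, each with a scale and a shift vector.
    layer_norms = 2 * 2 * d

    block = attention + feed_forward + layer_norms
    final_layer_norm = 2 * d
    out_head = d * vocab_size  # bias=False, as in the excerpt above

    return (
        token_embedding
        + positional_embedding
        + num_layers * block
        + final_layer_norm
        + out_head
    )


# Assumed values: vocab_size and context_length are guesses, not read from the config.
total = estimate_parameters(
    vocab_size=50257, context_length=256, embedding_dimension=128, num_layers=4
)
print(f"Estimated params: {total / 1e6:.2f}M")  # prints "Estimated params: 13.69M"
```

&lt;p&gt;Under these assumptions, the two vocabulary-sized matrices (token embedding and output head) account for most of the total, and the number of heads does not change the count at all. The number the script prints when it builds the model is the one to trust.&lt;/p&gt;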
&lt;p&gt;That's nice and all, but what about training?&lt;/p&gt;
&lt;h2 id="pre-training-headlines-in-portuguese"&gt;Pre-training headlines in Portuguese&lt;/h2&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/86/faisca_2025-10-04_22-02-58.png" alt="faisca"&gt;&lt;/p&gt;
&lt;p&gt;The first phase is pre-training. We filter the dataset for headlines in the Portuguese language. We then train on around 30K headlines over 10 epochs. At the start the model outputs gibberish, eventually settling into something more coherent. Even if not grammatically correct, it certainly &lt;em&gt;looks&lt;/em&gt; plausible.&lt;/p&gt;
&lt;p&gt;If we wanted to make it speak perfect Portuguese, we would train it on more data for longer (following the &lt;a href="https://arxiv.org/abs/2203.15556"&gt;Chinchilla law&lt;/a&gt;, the optimal training data for such a model would be around 250M tokens). But that's not the goal of this exercise. However, I did notice a strong bias towards Brazilian Portuguese - expected, since it's the most common variant of Portuguese.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;...
Epoch: 8 - Step: 1540 - Train Loss: 3.7417 - Validation Loss: 4.0037 - Batch: 37 / 188 - Tokens seen: 25247744
Epoch: 8 - Step: 1560 - Train Loss: 3.7077 - Validation Loss: 4.0016 - Batch: 57 / 188 - Tokens seen: 25575424
Epoch: 8 - Step: 1580 - Train Loss: 3.7250 - Validation Loss: 3.9938 - Batch: 77 / 188 - Tokens seen: 25903104
Epoch: 8 - Step: 1600 - Train Loss: 3.7074 - Validation Loss: 3.9858 - Batch: 97 / 188 - Tokens seen: 26230784
Epoch: 8 - Step: 1620 - Train Loss: 3.7159 - Validation Loss: 3.9816 - Batch: 117 / 188 - Tokens seen: 26558464
Epoch: 8 - Step: 1640 - Train Loss: 3.7063 - Validation Loss: 3.9745 - Batch: 137 / 188 - Tokens seen: 26886144
Epoch: 8 - Step: 1660 - Train Loss: 3.6965 - Validation Loss: 3.9735 - Batch: 157 / 188 - Tokens seen: 27213824
Epoch: 8 - Step: 1680 - Train Loss: 3.6909 - Validation Loss: 3.9691 - Batch: 177 / 188 - Tokens seen: 27541504
**** GENERATION 1 OF 1 ****
&amp;gt; 'Presidente com o que os quase 3 milhões de seução'
&amp;gt; 'Governo para ao de fazer em São de casação deixa'
&amp;gt; 'Cânia de caminhão do ensino, diz que veítica de lider'
*************************
Epoch: 9 - Step: 1700 - Train Loss: 3.6883 - Validation Loss: 3.9643 - Batch: 9 / 188 - Tokens seen: 27869184
Epoch: 9 - Step: 1720 - Train Loss: 3.6987 - Validation Loss: 3.9607 - Batch: 29 / 188 - Tokens seen: 28196864
Epoch: 9 - Step: 1740 - Train Loss: 3.6507 - Validation Loss: 3.9561 - Batch: 49 / 188 - Tokens seen: 28524544
...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
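&lt;p&gt;To put the Chinchilla figure in perspective, here is a quick sanity check. It assumes the commonly quoted rule of thumb of roughly 20 training tokens per parameter, which is an approximation of the paper's result and not an exact law. &lt;code&gt;compute_optimal_tokens&lt;/code&gt; is a hypothetical helper, and the tokens-seen value is the last one from the log above.&lt;/p&gt;

```python
TOKENS_PER_PARAMETER = 20  # assumption: the usual Chinchilla rule of thumb


def compute_optimal_tokens(
    num_parameters: int, tokens_per_parameter: int = TOKENS_PER_PARAMETER
) -> int:
    """Approximate compute-optimal number of training tokens for a model size."""
    return num_parameters * tokens_per_parameter


num_parameters = 13_000_000  # around 13 million, as stated above
tokens_seen = 28_524_544  # last "Tokens seen" value in the log above

optimal_tokens = compute_optimal_tokens(num_parameters)
fraction_seen = tokens_seen / optimal_tokens

print(f"Optimal tokens: {optimal_tokens / 1e6:.0f}M")  # prints "Optimal tokens: 260M"
print(f"Fraction seen so far: {fraction_seen:.1%}")  # prints "Fraction seen so far: 11.0%"
```

&lt;p&gt;So with this rule of thumb the run has covered roughly a tenth of the compute-optimal budget, and since training loops over the same headlines for several epochs, the amount of unique data is smaller still.&lt;/p&gt;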

&lt;p&gt;But how could we direct the model towards more of a Portuguese from Portugal? Enter our second training stage.&lt;/p&gt;
&lt;h2 id="supervised-fine-tuning-sft-on-portuguese-from-portugal"&gt;Supervised fine-tuning (SFT) on Portuguese from Portugal&lt;/h2&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/86/faisca_2025-10-04_22-02-58_sft.png" alt="faisca sft"&gt;&lt;/p&gt;
&lt;p&gt;Supervised fine-tuning is &lt;em&gt;just&lt;/em&gt; more training. In the case of ChatGPT and variants, this is where the models get trained on multi-turn conversations. In our case though, we want something slightly different.&lt;/p&gt;
&lt;p&gt;We re-filter the training data to include only headlines from websites whose domain ends in &lt;code&gt;.pt&lt;/code&gt;, to make sure they're from Portugal. We then train for about 5 epochs on 20K titles. And guess what? The effect is noticeable! After just a couple of epochs, the model's text feels much more like Portuguese from Portugal than from Brazil (words like "Cristiano", "Portugal", the accents, and the tone).&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;...
Epoch: 3 - Step: 85 - Train Loss: 3.6417 - Validation Loss: 4.1149 - Batch: 14 / 24 - Tokens seen: 1409024
Epoch: 3 - Step: 90 - Train Loss: 3.6514 - Validation Loss: 4.1140 - Batch: 19 / 24 - Tokens seen: 1490944
Epoch: 3 - Step: 95 - Train Loss: 3.6601 - Validation Loss: 4.1088 - Batch: 24 / 24 - Tokens seen: 1572864
**** GENERATION 1 OF 1 ****
&amp;gt; 'Presidente-se ao de Festa'
&amp;gt; 'Actualidade: «Novação da Saúbal e um mortos'
&amp;gt; 'Actualidade: "Aumento deixação ao de atentado em Portugal'
&amp;gt; 'Actualidade: «Papa Francisco é'
*************************
Epoch: 4 - Step: 100 - Train Loss: 3.6414 - Validation Loss: 4.1060 - Batch: 5 / 24 - Tokens seen: 1654784
Epoch: 4 - Step: 105 - Train Loss: 3.6308 - Validation Loss: 4.1045 - Batch: 10 / 24 - Tokens seen: 1736704
Epoch: 4 - Step: 110 - Train Loss: 3.6047 - Validation Loss: 4.0994 - Batch: 15 / 24 - Tokens seen: 1818624
Epoch: 4 - Step: 115 - Train Loss: 3.6141 - Validation Loss: 4.0964 - Batch: 20 / 24 - Tokens seen: 1900544
**** GENERATION 1 OF 1 ****
&amp;gt; 'Presidente: «Quero e ao de mais devembia'
&amp;gt; 'Fernando de ano como de saíria'
&amp;gt; 'Actualidade: Pelo com ao de mais de Berlim foições'
&amp;gt; 'Cristiano Oriental: «Estamos'
*************************
...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
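&lt;p&gt;The &lt;code&gt;.pt&lt;/code&gt; filter used for this stage can be sketched in a few lines. This is an illustration and not the code from the script: &lt;code&gt;is_from_portugal&lt;/code&gt; is a hypothetical helper, and the &lt;code&gt;title&lt;/code&gt; and &lt;code&gt;url&lt;/code&gt; field names and the example rows are assumptions.&lt;/p&gt;

```python
from urllib.parse import urlparse


def is_from_portugal(url: str) -> bool:
    """Return True when the URL's hostname ends with the .pt top-level domain."""
    hostname = urlparse(url).hostname or ""
    return hostname.lower().endswith(".pt")


# Hypothetical rows; the real dataset's column names may differ.
rows = [
    {"title": "headline A", "url": "https://www.example.pt/desporto/a"},
    {"title": "headline B", "url": "https://www.example.com.br/esporte/b"},
    {"title": "headline C", "url": "https://noticias.example.pt/c"},
]

portuguese_rows = [row for row in rows if is_from_portugal(row["url"])]
print([row["title"] for row in portuguese_rows])  # prints ['headline A', 'headline C']
```

&lt;p&gt;Checking the parsed hostname, and not the raw string, avoids false positives from paths or query strings that happen to contain &lt;code&gt;.pt&lt;/code&gt;.&lt;/p&gt;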

&lt;p&gt;What if we wanted to direct our model even more? What if all we wanted was football-related headlines? Enter our final phase.&lt;/p&gt;
&lt;h2 id="reinforcement-learning-grpo-for-sports-news"&gt;Reinforcement Learning (GRPO) for sports news&lt;/h2&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/86/faisca_2025-10-07_20-21-28_rl.png" alt="faisca rl"&gt;&lt;/p&gt;
&lt;p&gt;The last phase is all about &lt;a href="https://en.wikipedia.org/wiki/Reinforcement_learning"&gt;reinforcement learning&lt;/a&gt; (RL). The &lt;a href="https://openai.com/index/learning-from-human-preferences/"&gt;cherry on top of the cake&lt;/a&gt; that made ChatGPT so popular. A couple of years ago we needed to have some sort of &lt;a href="https://huggingface.co/datasets/openbmb/UltraFeedback"&gt;preference data&lt;/a&gt; to do reinforcement learning. In the last couple of years - as LLMs got more popular - other RL techniques came to fruition. Techniques like Proximal Policy Optimization (PPO), and Direct Preference Optimization (DPO) started popping up (&lt;a href="https://arxiv.org/abs/2404.10719"&gt;more reading here&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Recently, &lt;a href="https://www.nature.com/articles/s41586-025-09422-z"&gt;DeepSeek took the world by storm&lt;/a&gt; and came up with Group Relative Policy Optimization (&lt;a href="https://arxiv.org/abs/2402.03300"&gt;GRPO&lt;/a&gt;). GRPO speeds up things compared to other RL techniques by removing the need for a critic model. Instead, we sample a group of candidate responses from our LLM, score them using a "reward function", and use those scores to update the model towards the "good responses". In short, we need much less memory to optimize.&lt;/p&gt;
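&lt;p&gt;The "group relative" part is what replaces the critic: each response is judged against the other responses sampled for the same prompt. A minimal sketch of that step, assuming the mean and standard deviation normalization described in the GRPO paper (the script may differ in details such as the epsilon or the kind of standard deviation used), with &lt;code&gt;compute_group_advantages&lt;/code&gt; as a hypothetical helper:&lt;/p&gt;

```python
from statistics import mean, pstdev


def compute_group_advantages(rewards: list[float], epsilon: float = 1e-6) -> list[float]:
    """Normalize each reward against the mean and spread of its own group."""
    group_mean = mean(rewards)
    group_std = pstdev(rewards)  # population standard deviation of the group
    # epsilon avoids a division by zero when every response gets the same reward
    return [(reward - group_mean) / (group_std + epsilon) for reward in rewards]


# Four sampled headlines: two mention a target word (reward 1.0), two do not.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = compute_group_advantages(rewards)
print([round(advantage, 3) for advantage in advantages])  # prints [1.0, -1.0, -1.0, 1.0]
```

&lt;p&gt;Responses above the group average get a positive advantage and are pushed up, the ones below get pushed down. If every response in a group earns the same reward, all advantages are zero and that group contributes no learning signal.&lt;/p&gt;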
&lt;p&gt;Another beautiful thing about this technique is that we only need a single function to optimize our model:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;calculate_reward_for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;target_words&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s2"&gt;"futebol"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;"benfica"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;"porto"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;"sporting"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;"bola"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;"liga"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="c1"&gt;# ...&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;has_word&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;intersection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_words&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;has_word&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;That's it. That's enough to kick-start the training process for our tiny model. In the graph above you can see the rewards increase with every training step, as well as the KL divergence (which shows how much the model's probability distribution has shifted from the original).&lt;/p&gt;
&lt;p&gt;In practice, our model is now much more likely to generate headlines related to football - for which it was rewarded. It now mentions names like "Real Madrid" much more often!&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;...
3: kl= 0.2592, grad_norm= 0.7413
Processing experience 1 of 2
3: kl= 0.1728, grad_norm= 0.9851
Step 17 generated 24 completions
=== Sample completions ===
&amp;gt; Mais fé aos de lado pesso dização com o Governo que mais velho
&amp;gt; Mais o FC Porto de um luta contra o mundo aumento com o Sporting
&amp;gt; Mém o que faz apoi vivemio e agora do BES: «Quase um ligação do mundo de trabalho
&amp;gt; Sporting a todos os jogadores ao Real Madrid
=========================
Returns: 16.00/24
Returns of step 17: 16.0000
Processing experience 0 of 2
0: kl= 0.2157, grad_norm= 0.8532
...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
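&lt;p&gt;To make the &lt;code&gt;kl&lt;/code&gt; values in the log a bit more concrete, here is a minimal sketch of the KL divergence between two next-token distributions. The function name and the toy probabilities are made up for illustration, and this is not necessarily how the repo estimates KL during training:&lt;/p&gt;

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    # Terms where p is zero contribute nothing, so they are skipped.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi != 0)


# Hypothetical next-token probabilities over a 4-token vocabulary.
original_model = [0.25, 0.25, 0.25, 0.25]
tuned_model = [0.70, 0.10, 0.10, 0.10]

print(kl_divergence(original_model, original_model))  # identical distributions
print(kl_divergence(tuned_model, original_model))  # grows as the tuned model drifts
```

&lt;p&gt;A value of zero means the tuned model behaves exactly like the original; the further the reward pushes it towards football headlines, the larger the number gets.&lt;/p&gt;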

&lt;h2 id="final-thoughts-acknowledgements"&gt;Final thoughts &amp;amp; Acknowledgements&lt;/h2&gt;
&lt;p&gt;This was a pretty fun project to hack on, and it took longer than expected. I have some ideas for expanding it, such as making a more &lt;em&gt;Apple-Native&lt;/em&gt; version using &lt;a href="https://github.com/ml-explore/mlx"&gt;MLX&lt;/a&gt;. Potentially, we could also create a &lt;a href="https://huggingface.co/blog/moe"&gt;Mixture of Experts&lt;/a&gt; variant.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/duarteocarmo/faisca"&gt;The repo is 100% open source&lt;/a&gt; - feel free to fork it, hack on it, and adapt it to your needs! If you find bugs, don't hesitate to submit a PR.&lt;/p&gt;
&lt;p&gt;Finally, I want to acknowledge the great projects that inspired this work:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/karpathy/minGPT"&gt;minGPT from Andrej Karpathy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/rasbt/LLMs-from-scratch"&gt;LLMs from Scratch - Sebastian Raschka&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/mingyin0312/RLFromScratch"&gt;RLFromScratch - Ming Yin&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/ttumiel/minRLHF"&gt;minRLHF - Tom Tumiel&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;&lt;ol class="simple-footnotes"&gt;&lt;li id="sf-faisca-the-modern-llm-stack-in-a-single-script-1"&gt;Means spark in Portuguese &lt;a href="#sf-faisca-the-modern-llm-stack-in-a-single-script-1-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;li id="sf-faisca-the-modern-llm-stack-in-a-single-script-2"&gt;With only 32 active but still &lt;a href="#sf-faisca-the-modern-llm-stack-in-a-single-script-2-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;</content><category term="blog"/></entry><entry><title>AI: the utility bubble - for Publico.pt</title><link href="https://www-publico-pt.translate.goog/2025/10/08/opiniao/opiniao/ia-bolha-utilidade-2150129?_x_tr_sl=auto&amp;_x_tr_tl=en&amp;_x_tr_hl=en-US&amp;_x_tr_pto=wapp" rel="alternate"/><published>2025-10-09T00:00:00+02:00</published><updated>2025-10-09T00:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:www-publico-pt.translate.goog,2025-10-09:/2025/10/08/opiniao/opiniao/ia-bolha-utilidade-2150129</id><content type="html"/><category term="blog"/></entry><entry><title>Drowning in News</title><link href="https://duarteocarmo.com/blog/drowning-in-news.html" rel="alternate"/><published>2025-10-05T00:00:00+02:00</published><updated>2025-10-05T00:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2025-10-05:/blog/drowning-in-news.html</id><summary type="html">&lt;p&gt;The world moves fast, faster every day. For those who work with technology - and even those who don't - it's hard to keep up with the news. But I've always enjoyed staying up to date with what is happening, and my main tool to get it done hasn't changed in years …&lt;/p&gt;</summary><content type="html">&lt;p&gt;The world moves fast, faster every day. For those who work with technology - and even those who don't - it's hard to keep up with the news. 
But I've always enjoyed staying up to date with what is happening, and my main tool to get it done hasn't changed in years: it's &lt;a href="https://en.wikipedia.org/wiki/RSS"&gt;RSS&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;My morning routine has been the same for the past 10 years. Every morning I open up &lt;a href="https://reederapp.com/classic/"&gt;Reeder&lt;/a&gt; on my Mac - or on my phone if I'm on the go - and ctrl+click the articles that interest me the most. The setup is not groundbreaking: a &lt;a href="https://feedly.com/i"&gt;Feedly&lt;/a&gt; account for my feeds, and Reeder as the app to consume them. It's not complicated. But something was bothering me.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/84/reeder.png" alt="reeder classic"&gt;&lt;/p&gt;
&lt;p&gt;RSS lets me stay in control of what's happening. With RSS, I'm not a victim of someone else's algorithm. I decide what gets in front of me and how. But I didn't have that feeling anymore. Messy categories, more and more feeds, non-working feeds, just a firehose of articles every day. Scanning 100+ titles every morning is not a great way to start the day.&lt;/p&gt;
&lt;p&gt;Like most people nowadays, I bluntly decided to 'throw AI'&amp;reg; at the problem. I exported all my feeds from Feedly into &lt;code&gt;opml&lt;/code&gt; format and created a &lt;a href="https://gist.github.com/duarteocarmo/4869cae3f8c5bd5c95a556cc3a70ece3"&gt;quick script&lt;/a&gt; that uses LLMs to re-categorize all my feeds and reorganize them. That worked for a couple of weeks, but this was not a categorization problem. This was a &lt;em&gt;get back into control&lt;/em&gt; problem.&lt;/p&gt;
&lt;p&gt;I took the plunge and decided to self-host my RSS server. A couple of clicks on &lt;a href="/blog/how-i-self-host-in-2024.html"&gt;my Coolify instance&lt;/a&gt; and &lt;a href="https://freshrss.org/index.html"&gt;FreshRSS&lt;/a&gt; was up and running at &lt;a href="https://news.duarteocarmo.com"&gt;news.duarteocarmo.com&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;My first impressions are great. First, it automatically detects dead or non-working RSS feeds. That led to a lot of 'Oh! I loved following this website' moments. So I spent half an hour cleaning those up. Also, managing the categories of different feeds is just much easier. There's a nice drag and drop view to put everything into place. But still, RSS is an uphill battle.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/84/freshrss.png" alt="freshrss"&gt;&lt;/p&gt;
&lt;p&gt;The world doesn't like RSS. Publishers want you to go to the website and click those links. They'll ask for your email, they'll put up a paywall. Anything to get you to go to the website directly. But there's hope: here are some of my favorite tools to get everything back into your neatly organized RSS feed:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What about newsletters?&lt;/strong&gt; &lt;a href="https://kill-the-newsletter.com"&gt;Kill the Newsletter&lt;/a&gt; is the solution. A great app from &lt;a href="https://leafac.com/"&gt;Leandro&lt;/a&gt; that turns any newsletter into an RSS feed you can subscribe to. Better? It's free. I would pay for it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What about Substack?&lt;/strong&gt; I know there's a lot of interesting content on &lt;a href="https://substack.com/"&gt;Substack&lt;/a&gt;, and many people I follow are pretty active there. Here's the good news: every Substack publication &lt;a href="http://support.substack.com/hc/en-us/articles/360038239391-Is-there-an-RSS-feed-for-my-publication"&gt;also offers an RSS feed&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What about Reddit?&lt;/strong&gt; That's a good question. For years, Reddit has been gradually phasing out the ability to follow subreddits via RSS. Throttling and failures have become the norm. &lt;a href="https://feedly.com/new-features/posts/follow-reddit-in-feedly"&gt;Feedly offers a solution for this&lt;/a&gt;, but it hasn't worked reliably for me. I want to follow &lt;em&gt;just the right amount&lt;/em&gt; of Reddit content. Fortunately, there's an excellent open-source project called &lt;a href="https://github.com/johnwarne/reddit-top-rss"&gt;Reddit Top RSS&lt;/a&gt;. I self-hosted it on Coolify (again, just 2 clicks) and now I have a &lt;em&gt;limited&lt;/em&gt; and &lt;em&gt;controlled&lt;/em&gt; feed for every subreddit I'm interested in.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I haven't really scratched the surface of what &lt;a href="https://freshrss.org/index.html"&gt;FreshRSS&lt;/a&gt; can do. Statistics, filtering feeds, extensions, labels, I haven't even begun exploring it all. And honestly? That's fine. My Reeder workflow remains unchanged, my morning routine intact. But underneath, something fundamental has shifted. Just like with my &lt;a href="/blog/hacking-on-my-finances-part-2-beancount-on-beanstalk.html"&gt;finances&lt;/a&gt;, self-hosting gives me something no third-party service can: control.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Common misconceptions about AI</title><link href="https://duarteocarmo.com/blog/common-misconceptions-about-ai.html" rel="alternate"/><published>2025-09-07T00:00:00+02:00</published><updated>2025-09-07T00:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2025-09-07:/blog/common-misconceptions-about-ai.html</id><summary type="html">&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/83/gubbio_landscape.webp" alt="gubbio landscape" class="shadow"&gt;
&lt;br&gt;&lt;/p&gt;
&lt;p&gt;It's that time of year again. As usual, we took a couple of weeks off and came south: a bit of Portugal, a bit of Italy, a lot of friends and family. My family has a long-running joke that I hate people and love my computer. That's not (entirely) true …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/83/gubbio_landscape.webp" alt="gubbio landscape" class="shadow"&gt;
&lt;br&gt;&lt;/p&gt;
&lt;p&gt;It's that time of year again. As usual, we took a couple of weeks off and came south: a bit of Portugal, a bit of Italy, a lot of friends and family. My family has a long-running joke that I hate people and love my computer. That's not (entirely) true. I talked to lots of people this summer. But something felt off.&lt;/p&gt;
&lt;p&gt;ChatGPT this, AI that, Copilot here, Gemini there. Nobody had a clue what they were talking about. I’m not an AI guru, but I have worked in the field for some years. People are not stupid; people are misinformed.&lt;/p&gt;
&lt;p&gt;If you know what a Mixture-of-experts or KV cache is - some of these misconceptions might be surprising. If you don’t, then it’s time to clear some things up.&lt;/p&gt;
&lt;h2 id="large-language-models-are-connected-to-the-internet"&gt;Large Language Models are connected to the internet&lt;/h2&gt;
&lt;p&gt;Large language models (LLMs) are not connected to the internet. A model is a zip file, a file you can even run on your own laptop&lt;sup id="sf-common-misconceptions-about-ai-1-back"&gt;&lt;a href="#sf-common-misconceptions-about-ai-1" class="simple-footnote" title="You should try LM Studio"&gt;1&lt;/a&gt;&lt;/sup&gt;. A file that compresses the whole internet it was trained on. But that training had a beginning and an end. The model knows nothing about what happened after it was trained. Ask a model - in its purest form - for today's news and it will refuse or make something up.&lt;/p&gt;
&lt;p&gt;"But Duarte, when I ask ChatGPT today's news it knows". What ChatGPT - and many other apps - are doing is stuffing search results into the model context. ChatGPT used to have a Search button, remember? (I know my sister Cata was pissed I didn't tell her sooner). ChatGPT still has a "Search Web" button; it now just detects automatically whether your question needs a web search and uses it. So you don’t have to think about it.&lt;/p&gt;
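&lt;p&gt;In code, that "stuffing" is little more than string concatenation. A rough sketch, where &lt;code&gt;build_prompt&lt;/code&gt; and the result snippets are hypothetical and only stand in for whatever a real app does:&lt;/p&gt;

```python
def build_prompt(question, search_results):
    """Paste search snippets above the user's question before calling the model."""
    context = "\n".join(f"- {snippet}" for snippet in search_results)
    return (
        "Use the search results below to answer.\n\n"
        f"Search results:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical snippets; a real app would get these from a search engine.
results = ["Headline A from this morning", "Headline B from this morning"]
print(build_prompt("What is in the news today?", results))
```

&lt;p&gt;The model itself is unchanged. It simply reads the fresh text that was placed in front of it.&lt;/p&gt;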
&lt;h2 id="large-language-models-should-cite-their-sources"&gt;Large Language Models should cite their sources&lt;/h2&gt;
&lt;p&gt;Let’s say you are training your own LLM. You start by collecting the entire internet into a single Word document. You run a probabilistic model over it. The model starts learning. For example, whenever the model sees “The president of the USA is,” it always sees “Barack Obama” next to it. This is a large document; it has seen that pattern many times—in websites, Wikipedia, blogs, books, etc. And so it learns it.&lt;/p&gt;
&lt;p&gt;Once training stops, you prompt the model: “The president of the USA is,” and it responds: “Barack Obama,” simply because it is &lt;em&gt;likely&lt;/em&gt;. It sounds wrong today, and that’s exactly the point: the model only knows what it saw during training. How can we cite this process? It wasn’t a single website that made the LLM respond like that. It was the effect of the entire corpus. But what exactly has it seen?—you might ask—and how much? Well, that’s where things get &lt;a href="https://www.nytimes.com/2025/09/05/technology/anthropic-settlement-copyright-ai.html"&gt;tricky&lt;/a&gt;.&lt;/p&gt;
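&lt;p&gt;A toy version of that idea fits in a few lines. This sketch counts which word follows a given prompt in a made-up corpus and returns the most frequent one. Real LLMs learn a neural network over tokens rather than a lookup table, so treat it only as an illustration of what "likely" means:&lt;/p&gt;

```python
from collections import Counter


def most_likely_next(corpus, prompt):
    """Return the word that most often follows `prompt` in the corpus."""
    followers = Counter()
    for sentence in corpus:
        if sentence.startswith(prompt + " "):
            rest = sentence[len(prompt) + 1 :].split()
            if rest:
                followers[rest[0]] += 1
    return followers.most_common(1)[0][0] if followers else None


# Made-up training documents, all written while Obama was in office.
corpus = [
    "The president of the USA is Barack Obama",
    "The president of the USA is Barack Obama , said the report",
    "The president of the USA is visiting Lisbon",
]
print(most_likely_next(corpus, "The president of the USA is"))  # Barack
```

&lt;p&gt;Notice that the answer comes from the counts over all three documents together. There is no single source to point at.&lt;/p&gt;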
&lt;h2 id="an-ice-cream-flavor-generated-by-artificial-intelligence"&gt;An Ice-cream flavor generated by artificial intelligence&lt;/h2&gt;
&lt;p&gt;There are a lot of things that can be generated by AI. Some of them good, some of them less bad, a lot of them useless. Just because you &lt;em&gt;can&lt;/em&gt; do something with AI does not mean you should. Someone prompted ChatGPT for weird flavors of ice-cream, posted them online and got thousands of likes. We have a name for that. It’s called &lt;a href="https://en.wikipedia.org/wiki/AI_slop"&gt;slop&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;$$
\text{AI Generated} \neq \text{Good}
$$&lt;/p&gt;
&lt;p&gt;Half our feeds are low-quality AI content; the other half is filled with influencers who &lt;em&gt;know&lt;/em&gt; that adding the two magic letters will triple the interactions. In summary: be skeptical of anything with “AI” slapped on it.&lt;/p&gt;
&lt;h2 id="it-automatically-learns-from-what-you-tell-it"&gt;It automatically learns from what you tell it&lt;/h2&gt;
&lt;p&gt;My entire family thinks AI is some sort of all powerful monster that learns continuously from whatever they tell it. Another myth. A model is an artifact that is stuck in time. In order for it to learn something new, you need to retrain it. Retraining is a long and expensive process. If you tell ChatGPT that your mom's name is Elsa, and create another conversation, the &lt;em&gt;model&lt;/em&gt; itself does &lt;em&gt;not&lt;/em&gt; know that. &lt;sup id="sf-common-misconceptions-about-ai-2-back"&gt;&lt;a href="#sf-common-misconceptions-about-ai-2" class="simple-footnote" title="Even if they use your data for training - but this loop takes a while - it's not instant"&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;I know it feels like it's continuously learning. AI providers use small "tricks" to make that happen. They write what you told it to a file, and then inject that file into the context when you start a new conversation. That gives you the impression it's learning even though it’s not. The model needs to be retrained to get new knowledge.&lt;/p&gt;
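&lt;p&gt;Roughly, the trick looks like this. The file name, function names, and message format below are assumptions for illustration, not how any specific provider implements memory:&lt;/p&gt;

```python
import json
import tempfile
from pathlib import Path

# Hypothetical location for saved facts.
MEMORY_FILE = Path(tempfile.gettempdir()) / "memory.json"


def load_facts():
    """Read saved facts, or return an empty list if nothing was saved yet."""
    return json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else []


def remember(fact):
    """Append a fact to the memory file; the model's weights never change."""
    facts = load_facts()
    facts.append(fact)
    MEMORY_FILE.write_text(json.dumps(facts))


def start_conversation(user_message):
    """Inject the saved facts into the context of a brand new conversation."""
    system = "Known facts about the user: " + "; ".join(load_facts())
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_message},
    ]


remember("The user's mom is called Elsa")
print(start_conversation("What is my mom's name?"))
```

&lt;p&gt;Delete the file and the "memory" is gone, which is a good hint that nothing was learned by the model itself.&lt;/p&gt;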
&lt;p&gt;&lt;br&gt;
&lt;img src="https://duarteocarmo.com/images/83/roma.webp" alt="Roma pizza" class="shadow"&gt;
&lt;br&gt;&lt;/p&gt;
&lt;h2 id="are-we-in-a-bubble"&gt;Are we in a bubble?&lt;/h2&gt;
&lt;p&gt;Of course we are, even &lt;a href="https://www.cnbc.com/2025/08/18/openai-sam-altman-warns-ai-market-is-in-a-bubble.html"&gt;Sam has said it&lt;/a&gt;. Many companies are pivoting overnight to AI, and getting crazy investments as a byproduct, much like the dot-com days. Researchers are getting paid &lt;a href="https://www.nytimes.com/2025/07/31/technology/ai-researchers-nba-stars.html"&gt;millions of dollars&lt;/a&gt;. Influencers are getting crazy engagement just by mentioning AI. This is peak stupid.&lt;/p&gt;
&lt;p&gt;But AI is also insanely useful. That has real effects. Just look at your everyday now: Are you using Google as much? Are you taking advantage of models in any way? I am pretty sure you are. These models are useful. These models are powerful. And the open source ecosystem of models is genuinely growing and exciting. The whole world in your pocket? The whole internet’s knowledge without WiFi!&lt;/p&gt;
&lt;p&gt;As usual, we are somewhere in the middle of the useful and the overhyped.&lt;/p&gt;
&lt;h2 id="humanity-is-doomed-and-technology-sucks"&gt;Humanity is doomed and technology sucks&lt;/h2&gt;
&lt;p&gt;Why are we investing money in going to Mars? Why are we inventing these new things that destroy creativity? Why are we submitting our children to such a future? We should go back to pen and paper, we should stop this right now, we should invest only in our planet and nothing else. I understand, change is scary, we are scared of the unknown. Especially when we are older and used to things in their &lt;em&gt;normal&lt;/em&gt; state - whatever that is.&lt;/p&gt;
&lt;p&gt;In a lot of ways though, it has never been more interesting to be alive. Code is easier, learning is easier, diagnosing is easier, boring things are easier. That is progress. But using all of those easier things to build something stupid has also never been easier. And that feels like the opposite of progress.&lt;/p&gt;
&lt;h2 id="final-thoughts"&gt;Final thoughts&lt;/h2&gt;
&lt;p&gt;I'm an optimist. Where a lot of people see doom, I see something interesting. That’s why I am in love with technology. When changes are this big, fear, uncertainty, and doubt prevail. Keep two things in mind: it’s relevant to understand how AI works to some extent, and remember that &lt;a href="https://calnewport.com/no-one-knows-anything-about-ai/"&gt;no one knows anything.&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;When the dust has settled another thing will become clear: taste, creativity, and the will to do interesting things have never been more important. If I used ChatGPT to write this entire post, it would probably suck. But if I use ChatGPT strategically to give me feedback on how to make it better - it probably will.&lt;/p&gt;
&lt;p&gt;Like Vitto says: There are many fun things to do: go do them.&lt;/p&gt;&lt;ol class="simple-footnotes"&gt;&lt;li id="sf-common-misconceptions-about-ai-1"&gt;You should try &lt;a href="https://lmstudio.ai/"&gt;LM Studio&lt;/a&gt; &lt;a href="#sf-common-misconceptions-about-ai-1-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;li id="sf-common-misconceptions-about-ai-2"&gt;Even if they &lt;a href="https://privacy.anthropic.com/en/articles/10023580-is-my-data-used-for-model-training"&gt;use your data&lt;/a&gt; for training - but this loop takes a while - it's not instant &lt;a href="#sf-common-misconceptions-about-ai-2-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;</content><category term="blog"/></entry><entry><title>A Benchmark for language models on European Portuguese</title><link href="https://duarteocarmo.com/blog/a-benchmark-for-language-models-on-european-portuguese.html" rel="alternate"/><published>2025-07-21T00:00:00+02:00</published><updated>2025-07-21T00:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2025-07-21:/blog/a-benchmark-for-language-models-on-european-portuguese.html</id><summary type="html">&lt;div class="iframe-container"&gt;
  &lt;iframe scrolling="no" id="euroeval-frame" src="https://duarteocarmo.com/html/pt-euroeval.html" title="EuroEval European Portuguese Benchmarks" loading="lazy"&gt;
  &lt;/iframe&gt;
&lt;/div&gt;

&lt;style&gt;
  .iframe-container {
    width: 100%;
    max-width: 800px;
    margin: 2rem auto;
    text-align: center;
  }

  #euroeval-frame {
    width: 100%;
    height: 500px;
    border: none;
    display: block;
    box-sizing: border-box;
  }

  @media (max-width: 768px) {
    #euroeval-frame {
      height: 450px;
    }
  }

  @media (max-width: 480px) {
    #euroeval-frame {
      height: 350px;
    }
  }
&lt;/style&gt;

&lt;p&gt;A couple of weeks ago in Lisbon, I went to a friend's birthday dinner. In front of …&lt;/p&gt;</summary><content type="html">&lt;div class="iframe-container"&gt;
  &lt;iframe scrolling="no" id="euroeval-frame" src="https://duarteocarmo.com/html/pt-euroeval.html" title="EuroEval European Portuguese Benchmarks" loading="lazy"&gt;
  &lt;/iframe&gt;
&lt;/div&gt;

&lt;style&gt;
  .iframe-container {
    width: 100%;
    max-width: 800px;
    margin: 2rem auto;
    text-align: center;
  }

  #euroeval-frame {
    width: 100%;
    height: 500px;
    border: none;
    display: block;
    box-sizing: border-box;
  }

  @media (max-width: 768px) {
    #euroeval-frame {
      height: 450px;
    }
  }

  @media (max-width: 480px) {
    #euroeval-frame {
      height: 350px;
    }
  }
&lt;/style&gt;

&lt;p&gt;A couple of weeks ago in Lisbon, I went to a friend's birthday dinner. In front of me sat someone who recently started working for the Portuguese government, where they focus on modernization and technology. It's not every day that I talk to someone who works for the Portuguese government in an area similar to mine, so I was very curious. I asked about the &lt;a href="https://www.it.pt/News/NewsPost/5065"&gt;AMÁLIA&lt;/a&gt; project: the 5.5 million euro effort to create a new LLM &lt;em&gt;specifically&lt;/em&gt; designed for Portuguese.&lt;/p&gt;
&lt;p&gt;The first beta release of AMÁLIA is scheduled for the &lt;a href="https://www.portugal.gov.pt/pt/gc24/comunicacao/noticia?i=modelo-de-linguagem-em-grande-escala-para-a-lingua-portuguesa"&gt;first trimester of 2025&lt;/a&gt;, but I've heard little news about it. Still I asked: "Are you pre-training it? Or are you fine-tuning something that is already out there?". "No, we're training from scratch." the person told me. "Really, what's the point of that?" I asked, but did not get a clear reply. But the question stuck in my mind. What's the goal? To show Portugal is capable of pre-training a model from scratch? To build a model that is &lt;em&gt;specifically&lt;/em&gt; really good at Portuguese? &lt;sup id="sf-a-benchmark-for-language-models-on-european-portuguese-1-back"&gt;&lt;a href="#sf-a-benchmark-for-language-models-on-european-portuguese-1" class="simple-footnote" title="Other countries like Italy have done this - See Minerva. They also spend 5M, but from EU funds afaik"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;h2 id="the-problem-shooting-in-the-dark"&gt;The problem: Shooting in the dark&lt;/h2&gt;
&lt;p&gt;Let's assume the goal is to build a model that is &lt;em&gt;really good&lt;/em&gt; at Portuguese, and that the Portuguese state is really not interested in "showing we are capable of training models" (we have &lt;a href="https://www.theguardian.com/commentisfree/2025/jun/25/lisbon-europe-portugal-golden-visa-capital-investors-short-term-rentals"&gt;bigger problems&lt;/a&gt;). If we assume that, the first thing I would do would actually be to &lt;strong&gt;measure how good language models are at speaking Portuguese&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Portuguese is a popular language, it's the &lt;strong&gt;8th&lt;/strong&gt; most spoken language in the world. But it comes in &lt;a href="https://en.wikipedia.org/wiki/Portuguese-speaking_world"&gt;flavors&lt;/a&gt;. Portuguese spoken in Brazil (82% of native speakers) is quite different from Portuguese from Portugal (4% of speakers). A common complaint from people in Portugal is that the models often reply in "Brazilian" Portuguese. The difference is not only about the pronunciation: a lot of the vocabulary, verbs, and conjugations are different.&lt;/p&gt;
&lt;p&gt;Let's take Llama 3. It &lt;a href="https://scontent-cph2-1.xx.fbcdn.net/v/t39.2365-6/468347782_9231729823505907_4580471254289036098_n.pdf?_nc_cat=110&amp;amp;ccb=1-7&amp;amp;_nc_sid=3c67a6&amp;amp;_nc_ohc=MN9Qsqv_WlwQ7kNvwFzR-jm&amp;amp;_nc_oc=Adl-itXHUl8EqL_TJpicf2-H5wTmlDZO7zwWJkSwPs1eFM7cXlQzA1ddUZSIonQnPxI&amp;amp;_nc_zt=14&amp;amp;_nc_ht=scontent-cph2-1.xx&amp;amp;_nc_gid=r6Z1jsaqV629vM4DSY2Iag&amp;amp;oh=00_AfQPdfe0Ubcu4_kec6ttGpTvvtA2MNJeEJZ7_xzT3SJMFg&amp;amp;oe=6882A3C0"&gt;was trained&lt;/a&gt; on 8% multilingual data. If we assume 2% of that was Portuguese, and if we assume 5% of that is European Portuguese, then only &lt;strong&gt;0.008%&lt;/strong&gt; of the data Llama saw was European Portuguese. That's not a lot. But underrepresentation is a common issue.&lt;/p&gt;
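&lt;p&gt;Spelled out, with the 2% and 5% shares being the assumptions made above rather than measured numbers:&lt;/p&gt;
&lt;p&gt;$$
0.08 \times 0.02 \times 0.05 = 0.00008 = 0.008\%
$$&lt;/p&gt;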
&lt;h2 id="european-portuguese-on-euroeval"&gt;European Portuguese on EuroEval&lt;/h2&gt;
&lt;p&gt;Denmark has a similar problem, and I knew Dan Nielsen at &lt;a href="https://alexandra.dk/"&gt;The Alexandra Institute&lt;/a&gt; worked on something called &lt;a href="https://arxiv.org/pdf/2304.00906"&gt;ScandEval&lt;/a&gt; where he evaluated the performance of language models across different Scandinavian languages. I was surprised to see the project evolved into something more general: &lt;a href="https://euroeval.com/"&gt;EuroEval&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;EuroEval is similar to ScandEval - but for all European languages. Want to guess the one that &lt;em&gt;was&lt;/em&gt; missing? Yes, Portuguese.&lt;/p&gt;
&lt;p&gt;Over a couple of weeks, &lt;a href="https://github.com/EuroEval/EuroEval/issues/1040"&gt;we put together&lt;/a&gt;&lt;sup id="sf-a-benchmark-for-language-models-on-european-portuguese-2-back"&gt;&lt;a href="#sf-a-benchmark-for-language-models-on-european-portuguese-2" class="simple-footnote" title="Thanks Dan for all the help in making this happen!"&gt;2&lt;/a&gt;&lt;/sup&gt; a collection of datasets to evaluate the performance of language models in European Portuguese. We also did some extra work to ensure that the data is &lt;em&gt;exclusively&lt;/em&gt; Portuguese from Portugal.&lt;/p&gt;
&lt;p&gt;Here are the datasets we put together:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Sentiment Classification&lt;/strong&gt; (&lt;a href="https://huggingface.co/datasets/duarteocarmo/sst2-pt-mini"&gt;SST2-PT&lt;/a&gt;): Part of the work from the &lt;a href="https://arxiv.org/abs/2404.05333"&gt;ExtraGLUE&lt;/a&gt; project. A sentiment analysis dataset built using machine translation (DeepL).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Named Entity Recognition&lt;/strong&gt; (&lt;a href="https://huggingface.co/datasets/duarteocarmo/harem"&gt;HAREM&lt;/a&gt;): Part of the work from the &lt;a href="https://www.linguateca.pt/harem/"&gt;HAREM project&lt;/a&gt;. We filter by entries where the origin is PT - to create an NER dataset.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Linguistic Acceptability&lt;/strong&gt; (&lt;a href="https://huggingface.co/datasets/duarteocarmo/scala-pt"&gt;ScaLA-pt&lt;/a&gt;): Based on &lt;a href="https://universaldependencies.org/treebanks/pt_bosque/index.html"&gt;Portuguese-Bosque treebank&lt;/a&gt;, filtered by entries from &lt;a href="https://www.linguateca.pt/cetempublico/"&gt;CETEMPúblico&lt;/a&gt;. Created by corrupting grammatically correct sentences.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reading Comprehension&lt;/strong&gt; (&lt;a href="https://huggingface.co/datasets/duarteocarmo/boolq-pt"&gt;BoolQ-PT&lt;/a&gt;): Also part of the &lt;a href="https://arxiv.org/abs/2404.05333"&gt;ExtraGLUE&lt;/a&gt; work. Adapted by taking the original passage, question, and yes/no options, and turning them into a Q/A style question where the model can answer yes or no.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Knowledge&lt;/strong&gt; (&lt;a href="https://huggingface.co/datasets/duarteocarmo/mmlu-pt-mini"&gt;MMLU-pt&lt;/a&gt;): Based on &lt;a href="https://arxiv.org/abs/2410.08928"&gt;this paper&lt;/a&gt;. Already included entries specifically for Portuguese from Portugal.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Common-sense Reasoning&lt;/strong&gt; (&lt;a href="https://huggingface.co/datasets/duarteocarmo/goldenswag-pt-mini"&gt;GoldenSwag-pt&lt;/a&gt;): High quality filtered samples from the &lt;a href="https://aclanthology.org/P19-1472/"&gt;HellaSwag dataset&lt;/a&gt;. Also machine translated with DeepL for European Portuguese.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Summarization&lt;/strong&gt; (&lt;a href="https://huggingface.co/datasets/duarteocarmo/publico-mini"&gt;Publico&lt;/a&gt;): Filtered the &lt;a href="https://commoncrawl.org/blog/news-dataset-available"&gt;CCNews corpus&lt;/a&gt; for entries where the url matched &lt;a href="https://www.publico.pt/"&gt;Público&lt;/a&gt;. Transformed into a summarization dataset by extracting the first two sentences as the summary (a common trick).&lt;/li&gt;
&lt;/ul&gt;
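&lt;p&gt;The "first two sentences as the summary" trick can be sketched roughly as follows. The function name, the naive sentence split, the output keys, and the choice to drop the summary sentences from the article body are all assumptions for illustration, not the exact pipeline used to build the Publico dataset:&lt;/p&gt;

```python
def lead_summary(article, n_sentences=2):
    """Split an article into (body, summary) using its leading sentences."""
    # Naive split on ". "; a real pipeline would use a proper sentence tokenizer.
    sentences = [s.strip() for s in article.split(". ") if s.strip()]
    summary = ". ".join(sentences[:n_sentences])
    body = ". ".join(sentences[n_sentences:])
    # The split removes the period after the last summary sentence, so restore it.
    if summary and not summary.endswith("."):
        summary += "."
    return {"text": body, "target_text": summary}


# Made-up article text, only to show the shape of the output.
article = "The council approved the budget. The vote was close. Details follow. More soon."
print(lead_summary(article))
```

&lt;p&gt;It works because news articles tend to put the key facts in their opening sentences.&lt;/p&gt;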
&lt;p&gt;If you want to look at some samples, &lt;a href="https://huggingface.co/duarteocarmo"&gt;I published the datasets on HuggingFace&lt;/a&gt;. You can also read the extensive descriptions on the &lt;a href="https://euroeval.com/datasets/portuguese/"&gt;EuroEval docs&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="running-benchmarks-for-european-portuguese"&gt;Running benchmarks for European Portuguese&lt;/h2&gt;
&lt;p&gt;While Dan is working on the general leaderboard, I ran some benchmarks on my own which you saw on top of this blog post or &lt;a href="/html/pt-euroeval.html"&gt;in this link&lt;/a&gt;. I selected some models I was curious about within 3 "categories": Large, Small, and things I can run on my laptop without the fan coming on.&lt;/p&gt;
&lt;p&gt;If you're curious about how a particular model that I (or Dan) didn't benchmark performs, you can also run it yourself (assuming you have &lt;a href="https://docs.astral.sh/uv/getting-started/installation/"&gt;uv&lt;/a&gt; installed):&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$&lt;span class="w"&gt; &lt;/span&gt;uvx&lt;span class="w"&gt; &lt;/span&gt;--with&lt;span class="w"&gt; &lt;/span&gt;euroeval&lt;span class="w"&gt; &lt;/span&gt;euroeval&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;--model&lt;span class="w"&gt; &lt;/span&gt;ollama_chat/smollm2:135m&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;--task&lt;span class="w"&gt; &lt;/span&gt;sentiment-classification&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;--language&lt;span class="w"&gt; &lt;/span&gt;pt
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Set the &lt;code&gt;--model&lt;/code&gt; flag to any model that &lt;a href="https://docs.litellm.ai/docs/providers"&gt;LiteLLM&lt;/a&gt; supports.&lt;/p&gt;
&lt;p&gt;The reality of building these benchmarks is a clear realization: European Portuguese is an unpopular language. Brazilian Portuguese is simply a much more popular version of the language. Still, there is value in building benchmarks focused on the European variant. And it was a lot of fun tracking/building these datasets. I expect to do some more work in this area.&lt;/p&gt;
&lt;p&gt;And I &lt;em&gt;don't&lt;/em&gt; know if investing 5.5M Euro in developing a Portuguese Language Model is a good idea. But there's one thing I do know: Whenever that model comes out, we are now in a &lt;em&gt;much&lt;/em&gt; better position to say if it hit the mark. Or if it didn't.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;em&gt;Note: The benchmarks above are preliminary. For the official ones keep an eye on &lt;a href="https://euroeval.com/leaderboards/"&gt;EuroEval Leaderboards&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;&lt;ol class="simple-footnotes"&gt;&lt;li id="sf-a-benchmark-for-language-models-on-european-portuguese-1"&gt;Other countries like Italy have done this - &lt;a href="https://minerva-ai.org/"&gt;See Minerva&lt;/a&gt;. They also spend 5M, &lt;a href="https://eurohpc-ju.europa.eu/advancing-ai-eurohpc-minerva-project-2025-02-13_en"&gt;but from EU funds&lt;/a&gt; afaik &lt;a href="#sf-a-benchmark-for-language-models-on-european-portuguese-1-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;li id="sf-a-benchmark-for-language-models-on-european-portuguese-2"&gt;Thanks &lt;a href="https://www.saattrupdan.com/"&gt;Dan&lt;/a&gt; for all the help in making this happen! &lt;a href="#sf-a-benchmark-for-language-models-on-european-portuguese-2-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;</content><category term="blog"/></entry><entry><title>MCPs are mostly hype</title><link href="https://duarteocarmo.com/blog/mcps-are-mostly-hype.html" rel="alternate"/><published>2025-06-06T00:00:00+02:00</published><updated>2025-06-06T00:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2025-06-06:/blog/mcps-are-mostly-hype.html</id><summary type="html">&lt;p&gt;..but they can also be a lot of fun. If you work in tech, I'd say there's a 98% chance you've heard about it. MCPs are the future of agents, MCPs will be everywhere, MCPs are the future. The Model Context Protocol, first introduced by Anthropic is blowing up. For …&lt;/p&gt;</summary><content type="html">&lt;p&gt;..but they can also be a lot of fun. If you work in tech, I'd say there's a 98% chance you've heard about it. MCPs are the future of agents, MCPs will be everywhere, MCPs are the future. 
The Model Context Protocol, first introduced by Anthropic, is blowing up, for both &lt;a href="https://modelcontextprotocol.io/introduction#why-mcp%3F"&gt;good&lt;/a&gt; and &lt;a href="https://simonwillison.net/2025/Apr/9/mcp-prompt-injection/#mixing-tools-with-untrusted-instructions-is-inherently-dangerous"&gt;bad&lt;/a&gt; reasons.&lt;/p&gt;
&lt;p&gt;As always, I refrain from judging anything before taking it for a spin. At the end of the day, MCP is a standardized way of giving an LLM access to tools. Those tools are exposed through function calling. So nothing groundbreaking.&lt;/p&gt;
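&lt;p&gt;To make "function calling" concrete, here is a minimal sketch of the idea: the model emits a structured request naming a tool and its arguments, and the host program looks the tool up and runs it. The &lt;code&gt;get_weather&lt;/code&gt; tool, the &lt;code&gt;TOOLS&lt;/code&gt; registry, and the JSON payload shape are all made up for illustration; this is not the MCP wire format.&lt;/p&gt;

```python
import json


# Hypothetical tool, invented for this sketch.
def get_weather(city: str) -> str:
    """Return a canned forecast for the given city."""
    return f"Sunny in {city}"


# The registry maps tool names to plain Python callables.
TOOLS = {"get_weather": get_weather}


def dispatch(tool_call_json: str) -> str:
    """Run the tool named in a payload shaped like {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    tool = TOOLS[call["name"]]
    return tool(**call["arguments"])


# A model would emit something like this instead of free text.
print(dispatch('{"name": "get_weather", "arguments": {"city": "Oslo"}}'))
```

&lt;p&gt;Everything a protocol adds on top of this (discovery, transport, schemas) is plumbing around the same lookup-and-call step.&lt;/p&gt;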
&lt;p&gt;I've tested a few MCP servers. I added the GitLab MCP server to Zen, the Polars docs MCP to Claude, and a TickTick (my todo list) MCP server to Bolt. They all worked &lt;em&gt;sometimes&lt;/em&gt;. And with computers, when something only works sometimes, well, I prefer using something else.&lt;/p&gt;
&lt;p&gt;As the training for Oslo ramps up, I've once again felt the need to give an LLM access to "the outside world".&lt;/p&gt;
&lt;h3 id="accessing-my-training-textbooks"&gt;Accessing my training textbooks&lt;/h3&gt;
&lt;p&gt;For Oslo (and for the past 3 years), I've been following Pfitzinger's training methodology. In short, I believe mostly in volume. I have the PDF version of the book, and it's great. Whenever I dropped the book into any of the LLM/chat apps, I almost always got the same "context is full" or "too many tokens within X mins" message.&lt;/p&gt;
&lt;p&gt;I don't want the LLM to access the whole book. I want it to access some parts of the book. If your mind jumps straight to RAG, let me stop you right there. It's not RAG, it's search. We don't want a whole vector database. We don't want a whole retrieval system. We want keyword search, over my book. That's it.&lt;/p&gt;
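&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;rank_bm25&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BM25Okapi&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;prep_book&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;BM25Okapi&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;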
    &lt;span class="n"&gt;script_dir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="vm"&gt;__file__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parent&lt;/span&gt;
    &lt;span class="c1"&gt;# uvx --from &amp;quot;markitdown[pdf]&amp;quot; markitdown marathon.pdf &amp;gt; marathon.md (make it a md file)&lt;/span&gt;
    &lt;span class="n"&gt;book&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;script_dir&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;marathon.md&amp;quot;&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;book&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;r&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;utf-8&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;words&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;corpus&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;chunk_size_words&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;800&lt;/span&gt;
    &lt;span class="n"&gt;overlap&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk_size_words&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;chunk_size_words&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;overlap&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;corpus&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot; &amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;chunk_size_words&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;

    &lt;span class="n"&gt;tokenized_corpus&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot; &amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;corpus&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;bm25&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BM25Okapi&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokenized_corpus&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;bm25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;corpus&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search_marathon_training_book&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;total_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="sd"&gt;    Search the marathon training book for a specific query.&lt;/span&gt;
&lt;span class="sd"&gt;    Args:&lt;/span&gt;
&lt;span class="sd"&gt;        query (str): The search query.&lt;/span&gt;
&lt;span class="sd"&gt;        total_results (int): The number of results to return. (default: 10)&lt;/span&gt;
&lt;span class="sd"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;bm25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;corpus&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prep_book&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;tokenized_query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot; &amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bm25&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_top_n&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokenized_query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;corpus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;total_results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;results_as_bullets&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;lt;result_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;...&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;...&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;/result_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;gt;&amp;quot;&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;lt;search_marathon_training_book&amp;gt;&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;lt;search_query&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;/search_query&amp;gt;&lt;/span&gt;
&lt;span class="s2"&gt;    &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;results_as_bullets&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;lt;/search_marathon_training_book&amp;gt;&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
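&lt;p&gt;The chunking step is the part that is easiest to get wrong, so here is a small standalone sketch of the same sliding window on a toy word list. The &lt;code&gt;chunk_words&lt;/code&gt; helper and the tiny chunk size are made up for illustration; the book code uses 800 words with a 20% overlap, so its window advances 640 words at a time.&lt;/p&gt;

```python
def chunk_words(words: list[str], chunk_size: int, overlap_ratio: float = 0.2) -> list[str]:
    """Split words into chunks of chunk_size words, each sharing a tail with the next."""
    overlap = int(chunk_size * overlap_ratio)
    step = chunk_size - overlap  # how far the window advances each time
    return [" ".join(words[i : i + chunk_size]) for i in range(0, len(words), step)]


# Twelve toy words, chunks of five, one word of overlap.
words = [f"w{n}" for n in range(12)]
chunks = chunk_words(words, chunk_size=5)
for chunk in chunks:
    print(chunk)
```

&lt;p&gt;Each chunk starts with the last word of the previous one, and the final chunk is simply shorter. The overlap keeps a passage that straddles a boundary searchable from at least one chunk.&lt;/p&gt;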

&lt;h3 id="accessing-my-training-plan"&gt;Accessing my training plan&lt;/h3&gt;
&lt;p&gt;Now that we have a function to access my training "philosophy", we also need the LLM to be able to access my training plan. I'm a simple guy: &lt;a href="https://defy.org/hacks/calendarhack"&gt;I like my training plan in my calendar&lt;/a&gt;. It's one less tool I need to use. Well, we can also give the LLM access to my calendar's &lt;code&gt;ics&lt;/code&gt; file. We can just call &lt;code&gt;get_training_program_events&lt;/code&gt; to get the training events within a time range. Simple, effective.&lt;/p&gt;
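&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;zoneinfo&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ZoneInfo&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;icalendar&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Calendar&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_training_program_events&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start_date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end_date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;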
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="sd"&gt;    Fetches the athlete&amp;#39;s training program events in the date range specified.&lt;/span&gt;

&lt;span class="sd"&gt;    Args:&lt;/span&gt;
&lt;span class="sd"&gt;        start_date: Start date in format &amp;quot;DD-MM-YYYY&amp;quot;&lt;/span&gt;
&lt;span class="sd"&gt;        end_date: End date in format &amp;quot;DD-MM-YYYY&amp;quot;&lt;/span&gt;

&lt;span class="sd"&gt;    Returns:&lt;/span&gt;
&lt;span class="sd"&gt;        A string with matching VEVENTS in XML-like format.&lt;/span&gt;
&lt;span class="sd"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strptime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s2"&gt;-%m-%Y&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;tzinfo&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ZoneInfo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;UTC&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;end&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strptime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;end_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s2"&gt;-%m-%Y&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;tzinfo&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ZoneInfo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;UTC&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;ics_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="vm"&gt;__file__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parent&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;oslo_training_plan.ics&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;ics_content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ics_path&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_bytes&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;cal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Calendar&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_ical&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ics_content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;events_xml&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;to_datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;combine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;min&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tzinfo&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ZoneInfo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;UTC&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tzinfo&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tzinfo&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ZoneInfo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;UTC&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;component&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cal&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;walk&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;component&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;VEVENT&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;

        &lt;span class="n"&gt;dtstart&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;to_datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;component&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decoded&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;DTSTART&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;dtend_raw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;component&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;DTEND&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;dtend&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;to_datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;component&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decoded&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;DTEND&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;dtend_raw&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;dtend&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;dtstart&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;dtend&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;

        &lt;span class="n"&gt;summary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;component&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;SUMMARY&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;location&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;component&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;LOCATION&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;events_xml&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&amp;lt;event&amp;gt;&lt;/span&gt;
&lt;span class="s2"&gt;  &amp;lt;summary&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;/summary&amp;gt;&lt;/span&gt;
&lt;span class="s2"&gt;  &amp;lt;location&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;/location&amp;gt;&lt;/span&gt;
&lt;span class="s2"&gt;  &amp;lt;start&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;dtstart&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;/start&amp;gt;&lt;/span&gt;
&lt;span class="s2"&gt;  &amp;lt;end&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;dtend&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;/end&amp;gt;&lt;/span&gt;
&lt;span class="s2"&gt;&amp;lt;/event&amp;gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;lt;events&amp;gt;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;events_xml&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;/events&amp;gt;&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
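&lt;p&gt;The date filtering is the subtle bit, so here is a standalone sketch of it using only the standard library. It is a variation, not the exact check from the function: it uses the general interval-overlap test, which also matches an event that spans the whole query range. The &lt;code&gt;to_utc_datetime&lt;/code&gt; and &lt;code&gt;overlaps&lt;/code&gt; helpers and the sample dates are made up for illustration, and &lt;code&gt;datetime.timezone.utc&lt;/code&gt; stands in for &lt;code&gt;ZoneInfo("UTC")&lt;/code&gt;.&lt;/p&gt;

```python
import datetime

UTC = datetime.timezone.utc


def to_utc_datetime(val: datetime.date) -> datetime.datetime:
    """Promote a bare date to midnight UTC; give naive datetimes a UTC tzinfo."""
    if not isinstance(val, datetime.datetime):
        return datetime.datetime.combine(val, datetime.time.min, tzinfo=UTC)
    return val if val.tzinfo else val.replace(tzinfo=UTC)


def overlaps(ev_start, ev_end, range_start, range_end) -> bool:
    """True when the event and the query range share at least one instant."""
    return ev_end >= range_start and range_end >= ev_start


range_start = to_utc_datetime(datetime.date(2025, 6, 2))
range_end = to_utc_datetime(datetime.date(2025, 6, 8))

# A timed event inside the range.
long_run = (
    to_utc_datetime(datetime.datetime(2025, 6, 7, 8, 0)),
    to_utc_datetime(datetime.datetime(2025, 6, 7, 10, 30)),
)
# An all-day block that starts before the range and ends after it.
all_day_block = (
    to_utc_datetime(datetime.date(2025, 6, 1)),
    to_utc_datetime(datetime.date(2025, 6, 9)),
)

print(overlaps(*long_run, range_start, range_end))
print(overlaps(*all_day_block, range_start, range_end))
```

&lt;p&gt;Both calls print &lt;code&gt;True&lt;/code&gt;. Note that a range end parsed from a bare date is midnight at the start of that day, so events later on the end date fall outside the range.&lt;/p&gt;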

&lt;h3 id="checking-my-runs"&gt;Checking my runs&lt;/h3&gt;
&lt;p&gt;The LLM also needs to be able to compare the planned training sessions to the runs I've actually done (you know, to keep me in check, or give me feedback). The easiest (NOT the safest) way to do this is to use the &lt;code&gt;garminconnect&lt;/code&gt; library. We can also wrap that up in a simple function:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_athlete_runs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lookback_days&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="sd"&gt;    Fetch athlete&amp;#39;s running activities from Garmin Connect.&lt;/span&gt;
&lt;span class="sd"&gt;    Args:&lt;/span&gt;
&lt;span class="sd"&gt;        lookback_days (int): Number of days to look back for running activities.&lt;/span&gt;
&lt;span class="sd"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;garmin&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;start_garmin&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;today&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;today&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;startdate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;today&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timedelta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;days&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;lookback_days&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;activities&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;garmin&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_activities_by_date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;startdate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;today&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;activitytype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;running&amp;quot;&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;sorted_runs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;activities&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;startTimeLocal&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;lt;Runs&amp;gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sorted_runs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;startTimeLocal&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;distance_km&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;distance&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;duration_sec&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;duration&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;duration_fmt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timedelta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;duration_sec&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="n"&gt;avg_hr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;averageHR&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;n/a&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
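        &lt;span class="n"&gt;pace_sec_per_km&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;duration_sec&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;distance_km&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;distance_km&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;  &lt;span class="c1"&gt;# avoid ZeroDivisionError on zero-distance activities&lt;/span&gt;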
        &lt;span class="n"&gt;pace_min&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pace_sec_per_km&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;pace_sec&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pace_sec_per_km&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;pace&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;pace_min&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;pace_sec&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;02d&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; min/km&amp;quot;&lt;/span&gt;

        &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;  &amp;lt;Run&amp;gt;&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;lt;Date&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;/Date&amp;gt;&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;lt;Distance&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;distance_km&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; km&amp;lt;/Distance&amp;gt;&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;lt;Duration&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;duration_fmt&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;/Duration&amp;gt;&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;lt;Pace&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;pace&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;lt;/Pace&amp;gt;&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;lt;HeartRate&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;avg_hr&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; bpm&amp;lt;/HeartRate&amp;gt;&lt;/span&gt;
&lt;span class="s2"&gt;  &amp;lt;/Run&amp;gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;lt;/Runs&amp;gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
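&lt;p&gt;The pace arithmetic is easy to check in isolation. Here is a minimal sketch, assuming a hypothetical helper called &lt;code&gt;format_pace&lt;/code&gt; (it is not part of the original script) and made-up numbers. Returning &lt;code&gt;n/a&lt;/code&gt; when no distance was recorded is just one possible way to handle that case:&lt;/p&gt;

```python
import datetime


def format_pace(duration_sec, distance_km):
    """Return pace as 'M:SS min/km', or 'n/a' when no distance was recorded."""
    if not distance_km:
        return "n/a"
    pace_sec_per_km = duration_sec / distance_km
    pace_min = int(pace_sec_per_km // 60)
    pace_sec = int(pace_sec_per_km % 60)
    return f"{pace_min}:{pace_sec:02d} min/km"


# Hypothetical activity: 10 km in 50 minutes (3000 seconds).
print(format_pace(3000, 10.0))
print(str(datetime.timedelta(seconds=3000)))

# An activity with no recorded distance falls back to 'n/a'.
print(format_pace(0, 0))
```

&lt;p&gt;Keeping the formatting in a small function like this also makes it easy to test without touching Garmin Connect at all.&lt;/p&gt;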

&lt;h3 id="putting-it-all-together"&gt;Putting it all together&lt;/h3&gt;
&lt;p&gt;Our goal would be to give an LLM access to all these tools at the same time, and let it analyze and give me feedback on my training. MCP is supposed to solve just that - putting all your tools in a single place. This is the moment where we would start developing our complicated 12-factor MCP architecture. But instead of doing that, why don't we just use uv scripts?&lt;/p&gt;
&lt;p&gt;The only thing left to do is drop all the functions we want to give the LLM into a single file. With that in place, we can run &lt;code&gt;uv run oslo_marathon_mcp.py&lt;/code&gt;.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# /// script&lt;/span&gt;
&lt;span class="c1"&gt;# requires-python = &amp;quot;&amp;gt;=3.12&amp;quot;&lt;/span&gt;
&lt;span class="c1"&gt;# dependencies = [&lt;/span&gt;
&lt;span class="c1"&gt;#     &amp;quot;mcp[cli]&amp;quot;,&lt;/span&gt;
&lt;span class="c1"&gt;#     &amp;quot;garminconnect&amp;quot;,&lt;/span&gt;
&lt;span class="c1"&gt;#     &amp;quot;rank-bm25&amp;quot;,&lt;/span&gt;
&lt;span class="c1"&gt;#     &amp;quot;icalendar&amp;quot;,&lt;/span&gt;
&lt;span class="c1"&gt;# ]&lt;/span&gt;
&lt;span class="c1"&gt;# ///&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;mcp.server.fastmcp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastMCP&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;rank_bm25&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BM25Okapi&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;garminconnect&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Garmin&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;icalendar&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Calendar&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;zoneinfo&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ZoneInfo&lt;/span&gt;

&lt;span class="n"&gt;MCP&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;FastMCP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;oslo_marathon&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@MCP&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search_marathon_training_book&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;total_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="sd"&gt;    Search the marathon training book for a specific query.&lt;/span&gt;
&lt;span class="sd"&gt;    Args:&lt;/span&gt;
&lt;span class="sd"&gt;        query (str): The search query.&lt;/span&gt;
&lt;span class="sd"&gt;        total_results (int): The number of results to return. (default: 10)&lt;/span&gt;
&lt;span class="sd"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;

&lt;span class="nd"&gt;@MCP&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_athlete_runs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lookback_days&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="sd"&gt;    Fetch athlete&amp;#39;s running activities from Garmin Connect.&lt;/span&gt;
&lt;span class="sd"&gt;    Args:&lt;/span&gt;
&lt;span class="sd"&gt;        lookback_days (int): Number of days to look back for running activities.&lt;/span&gt;
&lt;span class="sd"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;

&lt;span class="nd"&gt;@MCP&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_training_program_events&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start_date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end_date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="sd"&gt;    Fetches the athlete&amp;#39;s training program events in the date range specified.&lt;/span&gt;

&lt;span class="sd"&gt;    Args:&lt;/span&gt;
&lt;span class="sd"&gt;        start_date: Start date in format &amp;quot;DD-MM-YYYY&amp;quot;&lt;/span&gt;
&lt;span class="sd"&gt;        end_date: End date in format &amp;quot;DD-MM-YYYY&amp;quot;&lt;/span&gt;

&lt;span class="sd"&gt;    Returns:&lt;/span&gt;
&lt;span class="sd"&gt;        A string with matching VEVENTS in XML-like format.&lt;/span&gt;
&lt;span class="sd"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;

&lt;span class="nd"&gt;@MCP&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_current_date&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="sd"&gt;    Returns the current date in ISO format.&lt;/span&gt;
&lt;span class="sd"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tz&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ZoneInfo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;UTC&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vm"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;__main__&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;MCP&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transport&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;stdio&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://gist.github.com/duarteocarmo/8f1463500e0c843b6e7848e0b5466ecc"&gt;Entire file here.&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Our MCP is ready. There are just short of 8984 ways to use an MCP server with an LLM. My favorite one is &lt;em&gt;still&lt;/em&gt; BoltAI. We open our MCP server configuration and add our nice little MCP server:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;mcpServers&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;oslo_marathon&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;args&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;run&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;/Users/duarteocarmo/Repos/scripts/oslo_marathon_mcp.py&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;command&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;uv&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
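&lt;p&gt;If you keep more than one of these scripts around, the same entry can be generated instead of typed by hand. A small sketch, assuming a hypothetical helper &lt;code&gt;mcp_server_entry&lt;/code&gt; and a placeholder path:&lt;/p&gt;

```python
import json


def mcp_server_entry(name, script_path):
    """Build a config dict that launches `uv run script_path` under the given server name."""
    return {"mcpServers": {name: {"command": "uv", "args": ["run", script_path]}}}


# Placeholder path; point it at wherever the script lives on your machine.
config = mcp_server_entry("oslo_marathon", "/path/to/oslo_marathon_mcp.py")
print(json.dumps(config, indent=2))
```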

&lt;p&gt;One of the nice features of BoltAI is the ability to create "projects". These are similar to the &lt;a href="https://help.openai.com/en/articles/10169521-using-projects-in-chatgpt"&gt;projects&lt;/a&gt; feature of ChatGPT. In short, BoltAI lets you create a template that is used for a group of chats. I created a "Running" project where I defined the model I'd like to use, as well as the &lt;code&gt;oslo_marathon&lt;/code&gt; MCP server. BoltAI also lets you inspect the MCP server to quickly check which tools are made available to the LLM.&lt;/p&gt;
&lt;p&gt;&lt;img alt="bolt mcp server" src="https://duarteocarmo.com/images/81/mcp_bolt.webp" /&gt;&lt;/p&gt;
&lt;p&gt;Finally, once everything is configured, we can ask the LLM something very specific about my run. And as you can see, the LLM calls all the necessary tools in order. Following the pattern the &lt;a href="https://arxiv.org/abs/2210.03629"&gt;ReAct paper&lt;/a&gt; introduced, the LLM now gives us much more contextualized information about my run, and searches the Pfitzinger training book for information on recovery strategies.&lt;/p&gt;
&lt;p&gt;&lt;img alt="bolt mcp calls" src="https://duarteocarmo.com/images/81/chat.webp" /&gt;&lt;/p&gt;
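&lt;p&gt;Stripped of the chat interface, the mechanics of "calling tools in order" are simple. The toy sketch below uses stand-in tools and a scripted list of calls in place of a real model, so everything in it (the fixed date, the fake results, &lt;code&gt;run_tool_loop&lt;/code&gt;) is an assumption made for illustration. A real ReAct-style loop would let the model choose each next call based on the previous observations:&lt;/p&gt;

```python
import datetime


# Stand-in tools; the real ones would query Garmin Connect, the book index, and so on.
def get_current_date():
    return datetime.date(2025, 5, 1).isoformat()  # fixed date so the sketch is reproducible


def fetch_athlete_runs(lookback_days):
    return f"(runs from the last {lookback_days} days)"


TOOLS = {"get_current_date": get_current_date, "fetch_athlete_runs": fetch_athlete_runs}

# A scripted plan standing in for the model's decisions: (tool name, arguments).
SCRIPTED_CALLS = [("get_current_date", {}), ("fetch_athlete_runs", {"lookback_days": 7})]


def run_tool_loop(calls, tools):
    """Execute each requested tool in order and collect the observations."""
    observations = []
    for name, arguments in calls:
        result = tools[name](**arguments)
        observations.append((name, result))
    return observations


for name, result in run_tool_loop(SCRIPTED_CALLS, TOOLS):
    print(f"{name}: {result}")
```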
&lt;h3 id="final-thoughts"&gt;Final thoughts&lt;/h3&gt;
&lt;p&gt;For the past few months, I've seen many people talking about MCP (and even &lt;a href="https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/"&gt;A2A&lt;/a&gt;) as things that will revolutionize AI and how we interact with LLMs. As always, there are A LOT of claims out there about what these protocols can do, and the promise they carry. It's hype and over-excitement in its purest form.&lt;/p&gt;
&lt;p&gt;I can't help but cringe sometimes when I see a big tech CTO talking about their incredibly great MCP server. For the rest of us who work in the industry, MCP is &lt;em&gt;just&lt;/em&gt; a list of functions we give the LLM access to, wrapped in a small server (FastMCP, in this post).&lt;/p&gt;
&lt;p&gt;And we can't deny it: tool calling is &lt;em&gt;incredibly&lt;/em&gt; powerful, but it's &lt;em&gt;still&lt;/em&gt; tool calling. Tools should be targeted, well described, and easy for the Large Language Model to understand. It's not about throwing a hammer around in a dark room and hoping that it will pick up an API call that was designed for programmers - it's about giving LLMs the right access in a controlled manner.&lt;/p&gt;
&lt;p&gt;Don't be fooled. You don't need to download one of the &lt;a href="https://mcp.so/"&gt;thousands&lt;/a&gt; of MCP servers of &lt;a href="https://simonwillison.net/2025/Apr/9/mcp-prompt-injection/#mixing-tools-with-untrusted-instructions-is-inherently-dangerous"&gt;dubious&lt;/a&gt; quality and security. You can just create a &lt;a href="https://gist.github.com/duarteocarmo/8f1463500e0c843b6e7848e0b5466ecc"&gt;200-line Python file&lt;/a&gt; and give it to the LLM. It has the same (probably better) effect.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Evals are all you need</title><link href="https://duarteocarmo.com/blog/evals-are-all-you-need.html" rel="alternate"/><published>2025-05-04T00:00:00+02:00</published><updated>2025-05-04T00:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2025-05-04:/blog/evals-are-all-you-need.html</id><summary type="html">&lt;p&gt;Last month, I built an app called &lt;a href="https://duarteocarmo.com/blog/taralli-home-cooked-food-tracking-without-the-bs.html"&gt;Taralli&lt;/a&gt;. It was fun to close the loop and get it out there. Still - there was an elephant in the room:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"...[calorie tracking] Accuracy isn’t great, and it makes some pretty basic mistakes. I’ve got plans to fix that..."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And things …&lt;/p&gt;</summary><content type="html">&lt;p&gt;Last month, I built an app called &lt;a href="https://duarteocarmo.com/blog/taralli-home-cooked-food-tracking-without-the-bs.html"&gt;Taralli&lt;/a&gt;. It was fun to close the loop and get it out there. Still - there was an elephant in the room:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"...[calorie tracking] Accuracy isn’t great, and it makes some pretty basic mistakes. I’ve got plans to fix that..."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And things started failing pretty early. &lt;code&gt;100g of peanut butter&lt;/code&gt; came back as 58,800 kcal, &lt;code&gt;350g of pasta&lt;/code&gt; as 60,000 kcal; quantities were all over the place. And even though most users were giving me good feedback, it was time to fix things. After some work, I managed to improve tracking accuracy from &lt;strong&gt;17% to 76%&lt;/strong&gt;. It turns out, evals are all you need.&lt;/p&gt;
&lt;div class="toc"&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#naive-is-good-but-it-fails-eventually"&gt;Naive is good, but it fails eventually&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#creating-a-golden-dataset"&gt;Creating a golden dataset&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#a-simple-metric"&gt;A simple metric&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#using-dspy-to-improve-the-model"&gt;Using DSPy to Improve the Model&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#integrating-the-improved-model"&gt;Integrating the Improved Model&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#the-flywheel-effect"&gt;The Flywheel Effect&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#final-thoughts"&gt;Final thoughts&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id="naive-is-good-but-it-fails-eventually"&gt;Naive is good, but it fails eventually&lt;/h2&gt;
&lt;p&gt;I constantly preach about how we shouldn't build complicated things before knowing if they're &lt;em&gt;actually&lt;/em&gt; going to be used. Taralli's first system for calorie counting was embarrassingly simple. All it did was take a user's food description as a string, pass it through &lt;code&gt;gpt-4o-mini&lt;/code&gt; with &lt;a href="https://platform.openai.com/docs/guides/structured-outputs"&gt;structured outputs&lt;/a&gt;, and generate a JSON response. Here's the &lt;a href="https://docs.pydantic.dev/latest/"&gt;Pydantic&lt;/a&gt; model that defines that JSON format:&lt;/p&gt;
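&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;enum&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Enum&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FoodGroup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;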
    &lt;span class="n"&gt;dairy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;dairy&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;meat_and_alternatives&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;meat and alternatives&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;grain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;grain&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;fruit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;fruit&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;vegetable&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;vegetable&amp;quot;&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FoodItem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;The name of the food item&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;quantity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;The quantity of the food item in the user&amp;#39;s description&amp;quot;&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;calories&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Calories in kilocalories (kcal) for a single unit of the food item&amp;quot;&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;carbs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Carbohydrates in grams for a single unit of the food item&amp;quot;&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;fat&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Fat in grams for a single unit of the food item&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;protein&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Protein in grams for a single unit of the food item&amp;quot;&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;fiber&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Fiber in grams for a single unit of the food item&amp;quot;&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;food_groups&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FoodGroup&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;default_factory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;The food groups to which the food item belongs&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;NutritionAnalysis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;food_items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FoodItem&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;A list of food items with their nutritional information&amp;quot;&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;With this in place - it's relatively simple to build an API that does the following:&lt;/p&gt;
&lt;p&gt;Input: &lt;code&gt;a handful of peanuts&lt;/code&gt;, output:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;food_items&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;peanuts&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;quantity&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;calories&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mf"&gt;414.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;carbs&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mf"&gt;16.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;fat&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mf"&gt;36.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;protein&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mf"&gt;19.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;fiber&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mf"&gt;8.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;food_groups&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;meat and alternatives&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The system returns not only calories but also macronutrients (carbs, fat, protein, fiber) and food groups. And even though the model respected the format almost 100% of the time, cracks started showing pretty quickly:&lt;/p&gt;
&lt;p&gt;Input: &lt;code&gt;100g of peanut butter&lt;/code&gt;, output:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;food_items&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;peanut butter&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;quantity&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;100.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;calories&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;588.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;carbs&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;20.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;fat&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;50.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;protein&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;25.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;fiber&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;6.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;food_groups&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;meat and alternatives&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Wait a second, 100 x 588? That’s 58,800 calories for 100g of peanut butter. That’s &lt;em&gt;wildly&lt;/em&gt; off (the model is &lt;a href="https://www.google.com/search?q=calories+in+100g+of+peanut+butter&amp;amp;oq=calories+in+100g+of+pean&amp;amp;gs_lcrp=EgZjaHJvbWUqBwgCEAAYgAQyBggAEEUYOTIHCAEQABiABDIHCAIQABiABDIICAMQABgWGB4yCAgEEAAYFhgeMggIBRAAGBYYHjIICAYQABgWGB4yCAgHEAAYFhgeMggICBAAGBYYHjIICAkQABgWGB7SAQgzOTc0ajFqN6gCALACAA&amp;amp;sourceid=chrome&amp;amp;ie=UTF-8"&gt;misinterpreting&lt;/a&gt; what I want for quantity here). It should have set name to "100g of peanut butter" and quantity to 1, for example.&lt;/p&gt;
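&lt;p&gt;To make the arithmetic concrete, here is a minimal sketch of the two readings. It assumes the total is simply &lt;code&gt;calories * quantity&lt;/code&gt; per item; &lt;code&gt;total_calories&lt;/code&gt; is a hypothetical helper, not code from the app.&lt;/p&gt;

```python
# Two readings of "100g of peanut butter", using the numbers from the output above.
# Assumption: calories are per unit of `quantity`, so total = calories * quantity.
model_output = {"name": "peanut butter", "quantity": 100.0, "calories": 588.0}
intended = {"name": "100g of peanut butter", "quantity": 1.0, "calories": 588.0}


def total_calories(item):
    """Total calories for one food item: per-unit calories times quantity."""
    return item["calories"] * item["quantity"]


print(total_calories(model_output))  # 58800.0
print(total_calories(intended))      # 588.0
```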
&lt;h2 id="creating-a-golden-dataset"&gt;Creating a golden dataset&lt;/h2&gt;
&lt;p&gt;When I published Taralli, I made the app completely free (even though I was paying for every LLM call). I did warn users that I was logging all of their inputs into the food tracking model. Using &lt;a href="https://wandb.ai/site/weave/"&gt;W&amp;amp;B Weave&lt;/a&gt; and a simple decorator, I was able to log every input-output of the system.&lt;/p&gt;
&lt;p&gt;&lt;img alt="weave screenshot" src="https://duarteocarmo.com/images/80/weave.png" /&gt;&lt;/p&gt;
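&lt;p&gt;The idea behind that decorator can be sketched with only the standard library. This is not Weave's API: &lt;code&gt;log_calls&lt;/code&gt;, &lt;code&gt;CALL_LOG&lt;/code&gt;, and &lt;code&gt;fake_food_model&lt;/code&gt; are hypothetical names, and the log lives in memory instead of being sent to a tracing backend.&lt;/p&gt;

```python
import functools

CALL_LOG = []  # hypothetical in-memory store, standing in for a tracing backend


def log_calls(fn):
    """Record every call's inputs and output, then return the output unchanged."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        output = fn(*args, **kwargs)
        CALL_LOG.append({"fn": fn.__name__, "args": args, "kwargs": kwargs, "output": output})
        return output
    return wrapper


@log_calls
def fake_food_model(food_description):
    # Stand-in for the real LLM call.
    return {"food_items": [{"name": food_description, "quantity": 1.0}]}


fake_food_model("a banana")
print(CALL_LOG[0]["args"], CALL_LOG[0]["output"])
```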
&lt;p&gt;That allowed me to start capturing real-world examples of where the prompt was failing. Looking through the data, I found several cases where my zero-shot prompt was just not cutting it (wrong food groups, lots of wrong quantities). Using this data, I started collecting a &lt;em&gt;golden dataset&lt;/em&gt;: a collection of food descriptions and their corresponding nutritional analysis. Using OpenAI's o3 and Google's Gemini 2.5 Pro - I was able to make sure all the quantities, descriptions, and food groups were exactly like I wanted them.&lt;/p&gt;
&lt;p&gt;But I still wanted a way to visualize, edit and update the dataset. So I built a small visualizer where I could see everything that was going on, and edit whatever was wrong. 200% vibe coded.&lt;/p&gt;
&lt;p&gt;&lt;img alt="dataset_reviewer" src="https://duarteocarmo.com/images/80/reviewer.png" /&gt;&lt;/p&gt;
&lt;h2 id="a-simple-metric"&gt;A simple metric&lt;/h2&gt;
&lt;p&gt;With my golden dataset in place, I needed a way of evaluating whether a given prediction from the prompt was &lt;em&gt;good&lt;/em&gt; or &lt;em&gt;bad&lt;/em&gt;. By looking at my entries - the most common mistakes were pretty obvious. Either it was calorie totals - or missing food groups.&lt;/p&gt;
&lt;p&gt;My approach was very basic (again) - and could be even more refined. Given a certain food description:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Is the total calories of the prediction within 10% of the total calories of the golden example?&lt;/li&gt;
&lt;li&gt;Is there an overlap between the food groups of the golden example vs. the predicted one?&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;If the answer to &lt;strong&gt;both&lt;/strong&gt; questions is 'Yes' - then the prediction is good. In code, that roughly translates to:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;math&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;eval_metric&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pred&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;gold_items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;food_items&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;pred_items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pred&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;food_items&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_totals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
        &lt;span class="n"&gt;calories&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;calories&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;quantity&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;groups&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;union&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;food_groups&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;calories&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;groups&lt;/span&gt;

    &lt;span class="n"&gt;gold_cal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gold_groups&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_totals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gold_items&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;pred_cal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pred_groups&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_totals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pred_items&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;calories_ok&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isclose&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gold_cal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pred_cal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rel_tol&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;gold_groups&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;pred_groups&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# at least one side has groups&lt;/span&gt;
        &lt;span class="n"&gt;groups_ok&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gold_groups&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;pred_groups&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;groups_ok&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;True&lt;/span&gt;  &lt;span class="c1"&gt;# both lists empty&lt;/span&gt;

    &lt;span class="n"&gt;sum_ok&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;calories_ok&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;groups_ok&lt;/span&gt;
    &lt;span class="n"&gt;is_ok&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sum_ok&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="c1"&gt;# if both are ok&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;is_ok&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
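&lt;p&gt;Two details of that metric are easy to miss, so here is a small worked check. &lt;code&gt;math.isclose&lt;/code&gt; compares the gap against the larger of the two values, so the band is slightly wider than "10% of the golden value", and the group check passes on any shared group, without penalizing extras. The stew numbers come from the evaluation log in this post; the 100 vs. 111 pair is made up for illustration.&lt;/p&gt;

```python
import math

# Stew example: gold 270 kcal, predicted 360 kcal.
print(math.isclose(270, 360, rel_tol=0.10))  # False: off by 90, allowed 0.10 * 360 = 36

# The tolerance is relative to the larger value, not to the golden one.
print(math.isclose(100, 111, rel_tol=0.10))  # True: 11 is within 0.10 * 111 = 11.1
print(abs(111 - 100) / 100)                  # 0.11, i.e. 11% relative to gold

# Food groups: one shared group is enough, extra predicted groups still pass.
gold = {"vegetable", "meat and alternatives"}
pred = {"grain", "vegetable", "meat and alternatives"}
print(not gold.isdisjoint(pred))             # True
```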

&lt;p&gt;Now that we have a metric - we can make number go up.&lt;/p&gt;
&lt;h2 id="using-dspy-to-improve-the-model"&gt;Using DSPy to Improve the Model&lt;/h2&gt;
&lt;p&gt;The first step was to use DSPy, &lt;a href="https://duarteocarmo.com/blog/what-the-hell-is-gqpa-anyway.html"&gt;a framework I've talked about before&lt;/a&gt;, to evaluate our &lt;em&gt;current&lt;/em&gt; performance. With my golden dataset split into training and testing sets and my metric ready, I could easily assess how well my current system was performing:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_example&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val_example&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;pred&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;classify_food&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val_example&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;food_description&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;eval_metric&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val_example&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pred&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;concurrent&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;futures&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;process_example&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;val_set&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Which resulted in:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;=====EVALUATION=====
example.food_description=&amp;#39;a banana&amp;#39;
gold_cal=105
pred_cal=105
CALORIES_OK? True
gold_groups={&amp;#39;fruit&amp;#39;}
pred_groups={&amp;lt;FoodGroup.fruit: &amp;#39;fruit&amp;#39;&amp;gt;}
GROUPS_OK? True
is_ok=True
=====EVALUATION=====
example.food_description=&amp;#39;300 g stew with chicken, potatoes and vegetables&amp;#39;
gold_cal=270
pred_cal=360
CALORIES_OK? False
gold_groups={&amp;#39;vegetable&amp;#39;, &amp;#39;meat and alternatives&amp;#39;}
pred_groups={&amp;lt;FoodGroup.grain: &amp;#39;grain&amp;#39;&amp;gt;, &amp;lt;FoodGroup.vegetable: &amp;#39;vegetable&amp;#39;&amp;gt;, &amp;lt;FoodGroup.meat_and_alternatives: &amp;#39;meat and alternatives&amp;#39;&amp;gt;}
GROUPS_OK? True
is_ok=False

# more examples, truncated for brevity..

Score: 17.24%
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
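&lt;p&gt;The final score is presumably just the fraction of examples where the metric returned &lt;code&gt;True&lt;/code&gt;. A minimal sketch, assuming 5 passes out of the 29 validation examples (an assumption chosen because it matches the reported 17.24%; the per-example results are not shown in full above):&lt;/p&gt;

```python
# Hypothetical per-example results: True where eval_metric passed.
# 5 passes out of 29 is an assumption chosen to match the reported score.
results = [True] * 5 + [False] * 24

score = sum(results) / len(results)  # True counts as 1, False as 0
print(f"Score: {score:.2%}")  # Score: 17.24%
```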

&lt;p&gt;The score of my production system on the dataset was only about &lt;strong&gt;17%&lt;/strong&gt;, not great - we know. To get yet another baseline for my system, I first tried using &lt;code&gt;Gemini 2.5 Flash&lt;/code&gt;, as I wanted to keep costs reasonable while still getting better performance. I evaluated it on the dataset using DSPy:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;classify&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NutritionAnalysis&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;gemini_flash_preview&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;openrouter/google/gemini-2.5-flash-preview:nitro&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;gemini_flash_preview&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;evaluator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;devset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;val_set&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;num_threads&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;display_progress&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;display_table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;trainset&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;evaluator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;classify&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metric&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;eval_metric&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Average Metric: 7.00 / 28 (25.0%): 100%|██████████| 29/29 [00:00&amp;lt;00:00, 960.39it/s]&lt;/span&gt;

&lt;span class="c1"&gt;# 2025/04/30 19:46:41 INFO dspy.evaluate.evaluate: Average Metric: 7.0 / 29 (24.1%)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Using this zero-shot approach, the score was about 24% (7 of 29 examples) - better than my &lt;code&gt;GPT-4o mini&lt;/code&gt; implementation, but still far from acceptable for users.&lt;/p&gt;
&lt;p&gt;I decided to try another &lt;code&gt;DSPy&lt;/code&gt; optimization technique called &lt;code&gt;BootstrapFewShotWithRandomSearch&lt;/code&gt;. This approach builds several candidate sets of examples from the training dataset, scores each one on the validation set, and keeps the set that works best as demonstrations in the prompt, which can dramatically improve performance. Here's the code:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;gemini_flash_preview&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;tp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BootstrapFewShotWithRandomSearch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;metric&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;eval_metric&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_labeled_demos&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_candidate_programs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;bootstrap_fewshot_random_search&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;classify&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deepcopy&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;trainset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;trainset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;valset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;val_set&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# final evaluation&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;gemini_flash_preview&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;evaluator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;devset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;val_set&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;num_threads&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;display_progress&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;display_table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;trainset&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;evaluator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bootstrap_fewshot_random_search&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metric&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;eval_metric&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Average Metric: 22.00 / 29 (75.9%): 100%|██████████| 29/29 [00:00&amp;lt;00:00, 3096.37it/s]&lt;/span&gt;

&lt;span class="c1"&gt;# 2025/04/30 19:46:45 INFO dspy.evaluate.evaluate: Average Metric: 22 / 29 (75.9%)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
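&lt;p&gt;To build some intuition for the random-search part, here is a toy sketch in plain Python. It is not DSPy's implementation and it skips the bootstrapping step entirely; &lt;code&gt;toy_program&lt;/code&gt;, &lt;code&gt;score&lt;/code&gt;, and &lt;code&gt;random_search&lt;/code&gt; are hypothetical names, and the "model" is a lookup that only knows foods present in its demos. The data is made up.&lt;/p&gt;

```python
import random

# Toy data: (food_description, calories). All values are made up for illustration.
trainset = [("banana", 105), ("apple", 95), ("egg", 78), ("toast", 80), ("rice", 205), ("milk", 150)]
val_set = [("banana", 105), ("egg", 78), ("rice", 205)]


def toy_program(demos, food):
    """Stand-in for an LLM call: it only 'knows' foods that appear in its demos."""
    return dict(demos).get(food, 0)


def score(demos, dataset):
    """Fraction of examples the program gets right with these demos in the prompt."""
    hits = sum(toy_program(demos, food) == cal for food, cal in dataset)
    return hits / len(dataset)


def random_search(trainset, val_set, num_candidates=16, demos_per_prompt=3, seed=0):
    """Sample candidate demo sets, score each on val_set, and keep the best one."""
    rng = random.Random(seed)
    candidates = [rng.sample(trainset, demos_per_prompt) for _ in range(num_candidates)]
    scores = [score(demos, val_set) for demos in candidates]
    best_index = scores.index(max(scores))
    return candidates[best_index], scores


best_demos, scores = random_search(trainset, val_set)
print(best_demos, max(scores))
```

&lt;p&gt;The shape is the same as in the real run: more candidates means more validation passes (and more LLM calls), in exchange for a better chance of landing on a strong set of demonstrations.&lt;/p&gt;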

&lt;p&gt;After a few minutes of optimization and testing, we achieved 76% accuracy on our validation dataset (22 of 29 examples) - an incredible improvement over both our previous system (17%) and the zero-shot approach (24%).&lt;/p&gt;
&lt;p&gt;There is a trade-off: the optimized prompt includes several few-shot examples, which can impact response time. However, since we're using the relatively fast &lt;code&gt;Gemini 2.5 Flash&lt;/code&gt; model, it doesn't matter &lt;em&gt;that&lt;/em&gt; much.&lt;/p&gt;
&lt;p&gt;We can see that for our problematic example of "100 g of peanut butter," the new model now provides a much more reasonable result:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;gemini_flash_preview&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bootstrap_fewshot_random_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;food_description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;100 g of peanut butter&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# Prediction(food_items=[FoodItem(name=&amp;#39;Peanut butter (100g)&amp;#39;, quantity=1.0, calories=588.0, carbs=20.0, fat=50.0, protein=25.0, fiber=6.0, food_groups=[&amp;#39;meat and alternatives&amp;#39;])])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Here's &lt;a href="https://gist.github.com/duarteocarmo/83487a0da408c2470c9b23b257e66931"&gt;the full prompt&lt;/a&gt; that DSPy uses for the example above. As you can see, it's essentially a few-shot approach to solving the problem, and it works surprisingly well.&lt;/p&gt;
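&lt;p&gt;For a rough picture of what "few-shot" means here, the sketch below assembles a prompt from an instruction, a couple of worked examples, and the new input. The layout, field labels, and &lt;code&gt;build_few_shot_prompt&lt;/code&gt; are assumptions for illustration only; the gist linked above shows the format DSPy really produces.&lt;/p&gt;

```python
# Hypothetical demos; the real ones are picked by DSPy from the training set.
demos = [
    ("a banana", '{"name": "banana", "quantity": 1.0, "calories": 105.0}'),
    ("100g of peanut butter", '{"name": "100g of peanut butter", "quantity": 1.0, "calories": 588.0}'),
]


def build_few_shot_prompt(instruction, demos, food_description):
    """Instruction first, then worked input/output pairs, then the new input."""
    parts = [instruction]
    for demo_input, demo_output in demos:
        parts.append(f"Food description: {demo_input}\nFood items: {demo_output}")
    # The final block is left open so the model completes the food items.
    parts.append(f"Food description: {food_description}\nFood items:")
    return "\n\n".join(parts)


prompt = build_few_shot_prompt("Estimate the nutrition of the described food.", demos, "two eggs")
print(prompt)
```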
&lt;h2 id="integrating-the-improved-model"&gt;Integrating the Improved Model&lt;/h2&gt;
&lt;p&gt;Now that we have an optimized prompt that performs well, we can save it and integrate it into our &lt;code&gt;FastAPI&lt;/code&gt; application:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# save our dspy program&lt;/span&gt;

&lt;span class="n"&gt;PATH_TO_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;../optimized_prompts/bootstrap_fewshot_random_search.json&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;bootstrap_fewshot_random_search&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PATH_TO_PROMPT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# create a function to process food descriptions&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_classifier_async&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Predict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;lm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;LM_NAME&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;settings&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;configure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;lm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;async_max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dspy_program&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NutritionAnalysis&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dspy_program&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PATH_TO_PROMPT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dspy_program&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;asyncify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dspy_program&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;dspy_program&lt;/span&gt;

&lt;span class="c1"&gt;# create an async processor&lt;/span&gt;

&lt;span class="n"&gt;DSPY_PROGRAM&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_classifier_async&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nd"&gt;@weave&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;op&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_calories_for_dspy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;food&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;NutritionAnalysis&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Getting calories for &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;food&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;DSPY_PROGRAM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;food_description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;food&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;food_items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;food_items&lt;/span&gt;
    &lt;span class="n"&gt;food_items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_dump&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;food_items&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;parsed_message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NutritionAnalysis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;food_items&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;food_items&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Got response: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;parsed_message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parsed_message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NutritionAnalysis&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Expected NutritionAnalysis response&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;parsed_message&lt;/span&gt;

&lt;span class="c1"&gt;# integrate it into our fastapi application&lt;/span&gt;

&lt;span class="nd"&gt;@app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;/calories-v2&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;response_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;NutritionAnalysis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Get calories for a food item from request body using DSPy.&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_calories_v2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;FoodRequest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Header&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;Get calories for a food item from request body using DSPy.&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="n"&gt;HTTPException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;403&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detail&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Invalid or missing API Key&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;get_calories_for_dspy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;food_description&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Along with some other new features in &lt;code&gt;Taralli&lt;/code&gt;, I integrated this new API endpoint, and now all users of the app benefit from our improved food prediction model, which is much more accurate and an overall better experience. Still room for improvement though!&lt;/p&gt;
&lt;h2 id="the-flywheel-effect"&gt;The Flywheel Effect&lt;/h2&gt;
&lt;p&gt;What I like about this approach is that it creates a flywheel effect. As more users interact with the app, I collect more data, which I can review, correct, and add to my golden dataset. This flywheel allows me to continuously update the prompt as we go.&lt;/p&gt;
&lt;p&gt;&lt;img alt="flywheel of continuous improvement" src="https://duarteocarmo.com/images/80/flywheel.png" /&gt;&lt;/p&gt;
&lt;h2 id="final-thoughts"&gt;Final thoughts&lt;/h2&gt;
&lt;p&gt;I've seen many projects in the wild just using a prompt or model and leaving it as is, without a good idea of how good (or probably bad) it really is. My first step, as always, is to just put it out there. There are just too many unknowns to get stuck on the 0 to 1 stage.&lt;/p&gt;
&lt;p&gt;Once things are out there, then it's time to start asking yourself: OK - what does good look like? With some simple monitoring, you'll start understanding what REALLY matters. Once that's done - then - &lt;strong&gt;evaluations are truly all you need&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Also - check out &lt;a href="https://apps.apple.com/dk/app/taralli/id6743634022"&gt;Taralli&lt;/a&gt; - and let me know what you think!&lt;/p&gt;
&lt;p&gt;&lt;img alt="Taralli in action" src="https://duarteocarmo.com/images/80/taralli.webp" /&gt;&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Taralli: Home-Cooked Food Tracking Without the BS</title><link href="https://duarteocarmo.com/blog/taralli-home-cooked-food-tracking-without-the-bs.html" rel="alternate"/><published>2025-04-07T00:00:00+02:00</published><updated>2025-04-07T00:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2025-04-07:/blog/taralli-home-cooked-food-tracking-without-the-bs.html</id><summary type="html">&lt;p&gt;For years, I never really cared about &lt;em&gt;what&lt;/em&gt; I ate, &lt;em&gt;how much&lt;/em&gt; I ate, or &lt;em&gt;when&lt;/em&gt; I ate it.&lt;/p&gt;
&lt;p&gt;But sometime late last year, I finally decided to listen to Vitto. I started noticing that what I eat actually had an impact on how much I ran, how well I …&lt;/p&gt;</summary><content type="html">&lt;p&gt;For years, I never really cared about &lt;em&gt;what&lt;/em&gt; I ate, &lt;em&gt;how much&lt;/em&gt; I ate, or &lt;em&gt;when&lt;/em&gt; I ate it.&lt;/p&gt;
&lt;p&gt;But sometime late last year, I finally decided to listen to Vitto. I started noticing that what I eat actually had an impact on how much I ran, how well I slept - and generally, how good I felt. So I decided to do what most people (at a certain point) do: I downloaded &lt;a href="https://www.myfitnesspal.com/"&gt;the&lt;/a&gt; &lt;a href="https://macrofactorapp.com/macrofactor/"&gt;usual&lt;/a&gt; &lt;a href="https://lifesum.com/"&gt;suspects&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Frustration quickly followed. I don't want the pizza from Trader Joe's, I want frozen pizza from &lt;a href="https://www.pingodoce.pt/"&gt;Pingo Doce&lt;/a&gt;. I'm not eating the Doritos sold in the American supermarket, I just ate the healthy ones from &lt;a href="https://kvickly.coop.dk/"&gt;Kvickly&lt;/a&gt;. I want to track two handfuls of peanuts. I want to track a &lt;em&gt;pão de deus com queijo&lt;/em&gt; from the &lt;a href="https://apadariaportuguesa.pt/"&gt;Padaria Portuguesa&lt;/a&gt;. With the usual suspects, it was painful. Everything was either too specific, too general, or matched a food database I didn't care about.&lt;/p&gt;
&lt;p&gt;After talking with Nuno and a couple of other friends, I quickly realized they weren't actually taking photos, scanning barcodes, or meticulously entering calories and macros in these apps. They were asking ChatGPT. That made sense. For some - certainly for me, it's actually more important to track &lt;em&gt;something&lt;/em&gt; than to be extremely accurate. Perfect is the enemy of good. It still holds.&lt;/p&gt;
&lt;p&gt;Enter &lt;a href="https://apps.apple.com/dk/app/taralli/id6743634022"&gt;Taralli&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img alt="taralli app screenshots" src="https://duarteocarmo.com/images/79/taralli-screenshot-food.webp" /&gt;&lt;/p&gt;
&lt;p&gt;Taralli allows you to track &lt;em&gt;anything&lt;/em&gt; you eat (really!). A &lt;a href="https://www.pingodoce.pt/receitas/pao-com-chourico/"&gt;pão com chouriço&lt;/a&gt;, a &lt;a href="https://encomendas.apadariaportuguesa.pt/produto/pao-de-deus/"&gt;pão de Deus&lt;/a&gt;, half a handful of peanuts – whatever language, whatever text, whatever you're eating, Taralli can handle it. Just type what you ate, and it'll handle the rest. It gives you macros, calories, fiber, and food group. It also has a bunch of analytics from the past week so you can see what's going on at a high level. You can also connect it to Apple Health to track your weight (if you're interested in that sort of stuff.)&lt;/p&gt;
&lt;p&gt;Taralli is built using &lt;a href="https://developer.apple.com/xcode/swiftui/"&gt;SwiftUI&lt;/a&gt;. What would have taken me 3-4 months and a lot of googling about SwiftUI took me something like a couple of weeks and the help of Claude 3.7 Sonnet (and still a lot of googling about SwiftUI I have to admit). I'm not sure I would say the whole thing was &lt;a href="https://x.com/karpathy/status/1886192184808149383?lang=en"&gt;"vibe coded"&lt;/a&gt;. I mean, I &lt;em&gt;still&lt;/em&gt; found some bugs, I &lt;em&gt;still&lt;/em&gt; had to make some important decisions about how things work together and are architected - otherwise this app would be a mess. But I have to admit that building the whole thing is way easier now. Except the Apple review process, I can confirm that one is still painful.&lt;/p&gt;
&lt;p&gt;For estimating the calories I'm using &lt;code&gt;gpt-4o-mini&lt;/code&gt; with some &lt;a href="https://platform.openai.com/docs/guides/structured-outputs"&gt;structured generation&lt;/a&gt; served over FastAPI. It actually sucks right now. Accuracy isn’t great, and it makes some pretty basic mistakes. I’ve got plans to fix that — but I’ll leave them for another release. &lt;a href="https://duarteocarmo.com/blog/simple-software"&gt;Remember&lt;/a&gt;, first make it work, &lt;em&gt;then&lt;/em&gt; make it pretty.&lt;/p&gt;
&lt;p&gt;&lt;img alt="taralli app screenshots" src="https://duarteocarmo.com/images/79/taralli-screenshot-graphs.webp" /&gt;&lt;/p&gt;
&lt;p&gt;You'll notice that unlike other apps, Taralli doesn't tell you what to do. That's because I don't care what you do. The goal of Taralli is to help you track. And its goal stops there. Just write anything in the text box. The more specific you are, the better.&lt;/p&gt;
&lt;p&gt;So here's &lt;a href="https://apps.apple.com/dk/app/taralli/id6743634022"&gt;Taralli&lt;/a&gt;. Food tracking without the BS.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Goodbye Kindle, I don't think I'll miss you.</title><link href="https://duarteocarmo.com/blog/goodbye-kindle-i-dont-think-ill-miss-you.html" rel="alternate"/><published>2025-03-29T00:00:00+01:00</published><updated>2025-03-29T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2025-03-29:/blog/goodbye-kindle-i-dont-think-ill-miss-you.html</id><summary type="html">&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/78/boox_3.webp" alt="boox palma on table" class="shadow"&gt;
&lt;br&gt;&lt;/p&gt;
&lt;p&gt;Even though it never &lt;em&gt;really&lt;/em&gt; replaced physical books for me, I've been a &lt;a href="https://kindle-highlights.email/"&gt;big Kindle user&lt;/a&gt; for many years. The ability to take hundreds of books with me anywhere is priceless. Vitto even has a running joke that I tend to lose them on the plane.&lt;/p&gt;
&lt;p&gt;I love my Kindle …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/78/boox_3.webp" alt="boox palma on table" class="shadow"&gt;
&lt;br&gt;&lt;/p&gt;
&lt;p&gt;Even though it never &lt;em&gt;really&lt;/em&gt; replaced physical books for me, I've been a &lt;a href="https://kindle-highlights.email/"&gt;big Kindle user&lt;/a&gt; for many years. The ability to take hundreds of books with me anywhere is priceless. Vitto even has a running joke that I tend to lose them on the plane.&lt;/p&gt;
&lt;p&gt;I love my Kindle so much I &lt;em&gt;never&lt;/em&gt; thought I'd replace it.&lt;/p&gt;
&lt;p&gt;But I did. And it feels GREAT.&lt;/p&gt;
&lt;p&gt;I first heard of the &lt;a href="https://shop.boox.com/products/palma"&gt;Boox Palma&lt;/a&gt; from &lt;a href="https://riccardo.im/"&gt;Riccardo&lt;/a&gt;, and then repeatedly from &lt;a href="https://www.theverge.com/24184777/boox-palma-e-ink-smartphone-reader"&gt;David&lt;/a&gt;. When nerds recommend a new gadget, I'm usually skeptical, but hey, these are &lt;em&gt;respectable&lt;/em&gt; nerds after all. After 2/3 weeks of hesitation, asking Vitto repeatedly if I should go for it (as usual) - I decided to finally press the button.&lt;/p&gt;
&lt;p&gt;From the moment it arrived, I never touched my Kindle ever again.&lt;/p&gt;
&lt;p&gt;There are many good things about the Boox Palma. But I'll start with what hit me first: It's well built. One of my favourite things about the Kindle was that those things would last a long time (if you didn't leave them on the plane). I'm not sure the Boox Palma will last that long - but for the past 4 months of use, I can throw it pretty much everywhere - it's sturdy.&lt;/p&gt;
&lt;p&gt;A Kindle that fits in your pocket. Now how great does that sound? One of my all-time annoyances with the Kindle was the fact that it was great to carry - but ONLY as long as you have a backpack. For airplane travel, it's great. But for every other time I left home without a bag - I just didn't take it. Now, with something that fits in my pocket, every occasion is a good occasion to take my e-reader with me. So I've found myself reading in places I would never have before. Waiting for a friend? Going out for lunch somewhere? Which is one of the big reasons why I love this gadget: it makes me read more!&lt;/p&gt;
&lt;p&gt;We can't talk about an e-reader without talking about the display. I'm by no means a display expert, but it's pretty sharp. You can tweak the refresh modes and filters - but I don't really tend to do so. I do tend to adjust the color temperature. Quite a lot, actually. If I'm reading at night I want something yellower. If I'm reading during the day I want something colder. My only complaint about the display is the lack of a native dark mode. I wish I could touch a button and &lt;em&gt;all&lt;/em&gt; the colors in my device would just invert. But either that option isn't available, or I haven't found it.&lt;/p&gt;
&lt;p&gt;No more Amazon. No more kindle store, no more 'send-to-kindle', no more 'we could not send your file', no more 'when does this thing sync so I can see my book'. Bye Amazon. With the Boox, all I need is Dropbox. The Boox Palma is something the Kindle never was: A full-fledged Android phone. I can install whatever I want, whenever I want. My RSS reader? Check. My newspaper apps? Check. The actual Kindle app? Also!&lt;/p&gt;
&lt;p&gt;But I don't use the Kindle app on my Boox. One of the most pleasant surprises when using the Boox Palma has been KOReader &lt;sup id="sf-goodbye-kindle-i-dont-think-ill-miss-you-1-back"&gt;&lt;a href="#sf-goodbye-kindle-i-dont-think-ill-miss-you-1" class="simple-footnote" title="Kevin Wammer's blog"&gt;1&lt;/a&gt;&lt;/sup&gt;. It's the reading app I've always wanted. And it's where I spend 99% of the time on my Boox Palma. Incredible customization (the fonts, the rendering, the images), dark mode, a super pleasant progress bar. I just love reading in this app. Best of all? It's free and open source. I have to take my hat off to the maintainers. It's another big reason why I love the Boox.&lt;/p&gt;
&lt;p&gt;And it lasts a long time! I tend to read ~1 hour every night, and more during traveling, and it comfortably lasts 2+ weeks without charging. Worth noting that I don't watch YouTube videos on it. You could. But not sure what the point would be.&lt;/p&gt;
&lt;p&gt;The Boox Palma is not perfect. The company has tried to shove AI down the users' throats a couple of times. I think I've received a single software update in ~4 months, which is not the most encouraging. And yes - it's an Android phone, but for everything else other than reading - it will feel sluggish. Expected? Probably. But it also costs 2-3 times what a Kindle does.&lt;/p&gt;
&lt;p&gt;But the reason why I loved my Kindle was not because it was powerful, or because of its great software. It was because it just made me read more often.&lt;/p&gt;
&lt;p&gt;And that's also why I love the Boox Palma. Because it &lt;em&gt;makes&lt;/em&gt; me read more.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Apps I use on the Boox:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://koreader.rocks/"&gt;KOReader&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://getpocket.com/"&gt;Pocket&lt;/a&gt; (not a nice experience - switching to &lt;a href="https://raindrop.io/"&gt;Raindrop&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.economist.com/"&gt;The Economist&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.publico.pt/"&gt;Publico&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://play.google.com/store/apps/details?id=com.seazon.feedme&amp;amp;hl=en&amp;amp;pli=1"&gt;FeedMe&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Everything else is in a folder I don't touch&lt;/li&gt;
&lt;/ul&gt;&lt;ol class="simple-footnotes"&gt;&lt;li id="sf-goodbye-kindle-i-dont-think-ill-miss-you-1"&gt;Kevin Wammer's &lt;a href="https://cliophate.wtf/"&gt;blog&lt;/a&gt; &lt;a href="#sf-goodbye-kindle-i-dont-think-ill-miss-you-1-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;</content><category term="blog"/></entry><entry><title>You also hate SQL? Let the LLM handle it</title><link href="https://duarteocarmo.com/blog/you-also-hate-sql-let-the-llm-handle-it.html" rel="alternate"/><published>2025-03-09T00:00:00+01:00</published><updated>2025-03-09T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2025-03-09:/blog/you-also-hate-sql-let-the-llm-handle-it.html</id><summary type="html">&lt;p&gt;Last year I made an effort to speak less and learn more. However, I still had the opportunity to present at a couple of conferences. One of them was &lt;a href="https://www.pyconwroclaw.com/"&gt;PyCon Wroclaw&lt;/a&gt;. The main goal was to talk about a couple of interesting paradigms I've come across while using LLMs to …&lt;/p&gt;</summary><content type="html">&lt;p&gt;Last year I made an effort to speak less and learn more. However, I still had the opportunity to present at a couple of conferences. One of them was &lt;a href="https://www.pyconwroclaw.com/"&gt;PyCon Wroclaw&lt;/a&gt;. The main goal was to talk about a couple of interesting paradigms I've come across while using LLMs to help me solve some Text-to-SQL problems.&lt;/p&gt;
&lt;p&gt;Unfortunately the recording came out a bit broken. So I decided to copy a typical post from &lt;a href="https://simonwillison.net/2025/Mar/8/nicar-llms/"&gt;Simon's blog&lt;/a&gt;, and do a small slide walk through here. I won't promise I'll do this for the rest of my &lt;a href="/talks"&gt;talks&lt;/a&gt;. But I think it could be a pretty interesting read.&lt;/p&gt;
&lt;p&gt;I started with my typical introduction:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0001.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0002.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;The goal of the talk was to walk through a 'grab bag' of client cases I've come across recently and five different lessons I've learned.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0004.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;The first chapter was all about setting expectations right. When clients come with a problem nowadays, they always expect a "ChatGPT" style of product.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0005.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0006.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;But ChatGPT isn't a simple product. It's actually a fairly complex piece of software.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0007.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0008.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;Here I introduced what I call the &lt;em&gt;no regrets move&lt;/em&gt;. I.e., what is the minimum thing that you can solve for a certain problem that will always be valuable? In the case of text-to-sql, it's &lt;em&gt;not&lt;/em&gt; the interface. It's just answering questions based on your data.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0009.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;The second chapter was all about getting &lt;em&gt;something out&lt;/em&gt;. Quick. I'm convinced that at least 50% of engineering problems are due to someone developing something in the basement instead of failing fast.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0010.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0011.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;Here I mentioned some tools to get a ChatGPT interface relatively fast. So you can focus on what matters. &lt;a href="https://gist.github.com/duarteocarmo/85435ce9209d1824f68e11b6126ab5c9"&gt;Here's&lt;/a&gt; the code for a ChatGPT-like interface using Streamlit that already includes tools like plotting.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0012.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;This is my favorite advice for any sort of engineering:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0013.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;In the 3rd chapter I finally start focusing on the task at hand. Text-to-SQL.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0014.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;I took some time to present the &lt;a href="https://bird-bench.github.io/"&gt;BIRD-SQL&lt;/a&gt; benchmark. One of the most popular benchmarks for text-to-sql.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0015.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;When looking at the rankings, we can see that the best-performing text-to-sql system has an accuracy of ~75%. So in the best-case scenario, whatever you build, you can already expect your LLM to get &lt;em&gt;at least&lt;/em&gt; 1 in 4 questions wrong.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0016.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0017.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;I cover two 'typical' approaches to text-to-sql, and how they range from something very simple on the right, just stuffing information into the prompt, to something a bit more &lt;a href="https://arxiv.org/abs/2405.16755"&gt;complex&lt;/a&gt; on the left.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0018.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;But in general, stuffing information into the prompt is what most people are doing.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0019.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;In the fourth chapter, I go into a good baseline to generate SQL to answer questions about a particular database.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0020.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;I start by talking about the funny 'please give me json' prompts. And show this small bit straight from one of Apple's prompts (Probably Apple Intelligence), where they beg the LLM to return json.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0021.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;But for me, structured outputs are critical for a maintainable and well-built LLM application. So I also take some time to talk about that.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0022.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;I start by introducing &lt;a href="https://python.useinstructor.com/"&gt;instructor&lt;/a&gt;, one of my favorite LLM libraries.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0023.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;Of course, you don't need to use instructor. You can go with &lt;a href="https://platform.openai.com/docs/guides/structured-outputs"&gt;Structured Outputs&lt;/a&gt; as well.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0024.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;But in general - I absolutely hate 'OpenAI code'. That is, code that is completely locked into OpenAI or any other AI provider. So I present &lt;a href="https://docs.litellm.ai/#litellm-python-sdk"&gt;LiteLLM&lt;/a&gt;. A great tool to keep your code provider agnostic.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0025.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;Here's where &lt;a href="https://docs.litellm.ai/#litellm-python-sdk"&gt;LiteLLM&lt;/a&gt; really shines: you can switch to &lt;em&gt;any&lt;/em&gt; model from &lt;em&gt;any&lt;/em&gt; provider, just by switching the model name.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0026.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;As an example I use a simple SQLite database of my running data from Strava.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0027.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;Here's the code to load the Strava data and a small preview of the data.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0029.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;Time to build a basic text-to-sql prompt.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0030.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;My prompt structure. It includes a Preamble, the CREATE TABLE statement so that the LLM knows the structure, some response guidelines, and the user question.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0031.webp" class="shadow"/&gt;&lt;/p&gt;
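&lt;p&gt;As a rough illustration of that structure, here is a minimal sketch that assembles the four sections into one prompt. It is not the code from the slides: the &lt;code&gt;activities&lt;/code&gt; table, the preamble and guideline wording, and the helper names &lt;code&gt;get_schema&lt;/code&gt; and &lt;code&gt;build_prompt&lt;/code&gt; are all assumptions made up for the example.&lt;/p&gt;

```python
import sqlite3

# Hypothetical wording, for illustration only.
PREAMBLE = "You write SQLite queries that answer questions about running data."
GUIDELINES = (
    "- Return a single SELECT statement.\n"
    "- Only use the table and columns listed in the schema."
)


def get_schema(conn: sqlite3.Connection, table: str) -> str:
    """Return the CREATE TABLE statement SQLite stored for `table`."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    if row is None:
        raise ValueError(f"Unknown table: {table}")
    return row[0]


def build_prompt(conn: sqlite3.Connection, table: str, question: str) -> str:
    """Join preamble, schema, guidelines and question into one prompt."""
    sections = [
        PREAMBLE,
        "Schema:\n" + get_schema(conn, table),
        "Guidelines:\n" + GUIDELINES,
        "Question: " + question,
    ]
    return "\n\n".join(sections)


# Hypothetical stand-in for the Strava database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activities (id INTEGER PRIMARY KEY, start_date TEXT, distance_km REAL)"
)
prompt = build_prompt(conn, "activities", "How far did I run in 2024?")
print(prompt)
```

&lt;p&gt;Reading the schema from &lt;code&gt;sqlite_master&lt;/code&gt; keeps the prompt in sync with the database, so there is no hand-copied CREATE TABLE statement to go stale.&lt;/p&gt;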
&lt;p&gt;The issue with most of these systems is that we have no way to ensure a certain query is actually valid.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0032.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;Here's a particularly nasty piece of code. I use &lt;a href="https://python.useinstructor.com/"&gt;Instructor&lt;/a&gt; to generate the SQL from a question, but also include a custom validator. This will ensure that whatever the LLM generates can actually be run against the database. This is, of course, very dangerous, so don't put this code in production: you'd be opening yourself up to all sorts of problems. But it's a nice way to ensure &lt;em&gt;validity&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Here's the Frankenstein class:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0033.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;And the function we use to call it.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0034.webp" class="shadow"/&gt;&lt;/p&gt;
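&lt;p&gt;The slides use Instructor with a custom validator. As a stdlib-only sketch of the same validity idea (not the code from the slides), one option is to ask SQLite to compile the query with &lt;code&gt;EXPLAIN QUERY PLAN&lt;/code&gt; instead of running it. The function name &lt;code&gt;validate_sql&lt;/code&gt;, the SELECT-only rule, and the &lt;code&gt;activities&lt;/code&gt; table are assumptions made for illustration.&lt;/p&gt;

```python
import sqlite3


def validate_sql(conn: sqlite3.Connection, query: str) -> str:
    """Return the cleaned query if SQLite can compile it as one SELECT.

    Raises ValueError otherwise.
    """
    cleaned = query.strip().rstrip(";")
    if not cleaned.lower().startswith("select"):
        raise ValueError("Only SELECT statements are allowed")
    try:
        # EXPLAIN QUERY PLAN compiles the statement (checking syntax,
        # tables and columns) without running the query itself.
        conn.execute("EXPLAIN QUERY PLAN " + cleaned)
    except (sqlite3.Error, sqlite3.Warning) as exc:
        # Older Python versions raise sqlite3.Warning for multi-statement
        # input, newer ones raise sqlite3.ProgrammingError.
        raise ValueError(f"Invalid SQL: {exc}") from exc
    return cleaned


# Hypothetical stand-in for the Strava database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activities (id INTEGER PRIMARY KEY, distance_km REAL)")
checked = validate_sql(conn, "SELECT COUNT(*) FROM activities;")
print(checked)
```

&lt;p&gt;A check like this only says the query compiles against the schema. It says nothing about whether the query answers the question, and the SELECT-only rule in this sketch also rejects valid queries that start with a &lt;code&gt;WITH&lt;/code&gt; clause.&lt;/p&gt;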
&lt;p&gt;Here are some examples of running this system with two questions. For one, it responds well. For the other, it gets the answer completely wrong. Both queries run cleanly and are valid from the SQL perspective, but that doesn't mean they are correct.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0035.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;So now that we are getting valid answers, how can we ensure that they are actually good?&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0036.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;I take Chapter 5 to talk about how to make things better once you have this baseline in place.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0037.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;Mandatory meme about testing and performance. If you never test, your tests never fail. If you never evaluate, your LLM system is perfect.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0038.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;To improve your text-to-sql system, you actually need to measure it.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0039.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;This is just like traditional Machine Learning. We can go back to one of the most well-known metrics: Accuracy.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0040.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;To measure accuracy, we can create some questions (with known answers) based on the data in our database, use our LLM system to respond to those same questions, and compare both answers. That's it!&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0042.webp" class="shadow"/&gt;&lt;/p&gt;
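&lt;p&gt;A minimal sketch of that loop, under some loud assumptions: answers are compared by exact match of the returned rows (a simpler stand-in for the LLM-based comparison used in the talk), and &lt;code&gt;fake_generate_sql&lt;/code&gt; is a hypothetical placeholder for the real LLM call. The table and test cases are invented for the example.&lt;/p&gt;

```python
import sqlite3


def run(conn, sql):
    """Run a query and return its rows in an order-independent form."""
    return sorted(conn.execute(sql).fetchall(), key=repr)


def accuracy(conn, cases, generate_sql):
    """Share of cases where generated SQL returns the reference rows."""
    passed = 0
    for case in cases:
        expected = run(conn, case["reference_sql"])
        try:
            actual = run(conn, generate_sql(case["question"]))
        except sqlite3.Error:
            actual = None  # a query that fails to run counts as a miss
        passed += int(actual == expected)
    return passed / len(cases)


# Hypothetical data and test cases.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activities (id INTEGER PRIMARY KEY, distance_km REAL)")
conn.executemany(
    "INSERT INTO activities (distance_km) VALUES (?)",
    [(5.0,), (10.0,), (21.1,)],
)

cases = [
    {
        "question": "How many runs are there?",
        "reference_sql": "SELECT COUNT(*) FROM activities",
    },
    {
        "question": "What is the longest run?",
        "reference_sql": "SELECT MAX(distance_km) FROM activities",
    },
]


def fake_generate_sql(question):
    """Placeholder for the LLM: one canned answer is right, one is wrong."""
    canned = {
        "How many runs are there?": "SELECT COUNT(id) FROM activities",
        "What is the longest run?": "SELECT MIN(distance_km) FROM activities",
    }
    return canned[question]


score = accuracy(conn, cases, fake_generate_sql)
print(f"accuracy: {score:.0%}")
```

&lt;p&gt;Exact row matching is strict: a correct answer with an extra column or different rounding counts as a failure, which is one reason to prefer a softer comparison with a PARTIAL grade.&lt;/p&gt;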
&lt;p&gt;We can use instructor/structured outputs again here, to understand if our LLM answer matches the information in our baseline answer.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0043.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;I really like looking at data, and evaluating where things go wrong. But I'm skeptical of the myriad of tools coming out to evaluate LLMs. And I'm a big fan of Google Sheets. Mainly because when I look at the data, I probably also look at it with some sort of subject matter expert. So the fewer 'things' between us and the data, the better. And Google Sheets are fine.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0044.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;Here's an example of how I use Google Sheets to look at the performance of this text-to-sql system. You can see the question we want answered. The SQL and answer from our ground truth, the SQL and answer from the LLM system, and its evaluation. We also include a reason why a certain case was FAIL/PARTIAL.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0045.webp" class="shadow"/&gt;&lt;/p&gt;
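&lt;p&gt;One low-tech way to get results into a sheet like this is to write them out as CSV and import the file into Google Sheets. The sketch below assumes column names that mirror the description above; the names, the &lt;code&gt;to_csv&lt;/code&gt; helper, and the example row are all hypothetical.&lt;/p&gt;

```python
import csv
import io

# Assumed column names, mirroring the sheet described above.
COLUMNS = [
    "question",
    "reference_sql",
    "reference_answer",
    "llm_sql",
    "llm_answer",
    "evaluation",
    "reason",
]


def to_csv(rows):
    """Serialize evaluation rows (dicts keyed by COLUMNS) to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented example row.
rows = [
    {
        "question": "What is the longest run?",
        "reference_sql": "SELECT MAX(distance_km) FROM activities",
        "reference_answer": "21.1",
        "llm_sql": "SELECT MIN(distance_km) FROM activities",
        "llm_answer": "5.0",
        "evaluation": "FAIL",
        "reason": "Used MIN instead of MAX",
    }
]
sheet = to_csv(rows)
print(sheet)
```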
&lt;p&gt;This is all nice and dandy. But we should never forget the big picture. So what?&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0046.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;In the final slide, I come back to the three most important lessons: Iterate quickly, make the SQL generation robust and explainable, and iterate from a baseline.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/77/pycon-wroclaw-llm-text-to-sql_page-0047.webp" class="shadow"/&gt;&lt;/p&gt;
&lt;p&gt;I then took some questions and wrapped things up!&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Simple AI tools</title><link href="https://duarteocarmo.com/blog/simple-ai-tools.html" rel="alternate"/><published>2025-02-18T00:00:00+01:00</published><updated>2025-02-18T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2025-02-18:/blog/simple-ai-tools.html</id><summary type="html">&lt;p&gt;Last week over lunch, &lt;a href="https://www.parraguezr.net/"&gt;Pedro&lt;/a&gt; was telling me how he uses &lt;a href="https://www.perplexity.ai/"&gt;Perplexity&lt;/a&gt; to improve the way he searches around the web. He mentioned we're entering a phase where we have at least 20 'AI' tools to boost our productivity - and the tough thing now is choosing which ones to use …&lt;/p&gt;</summary><content type="html">&lt;p&gt;Last week over lunch, &lt;a href="https://www.parraguezr.net/"&gt;Pedro&lt;/a&gt; was telling me how he uses &lt;a href="https://www.perplexity.ai/"&gt;Perplexity&lt;/a&gt; to improve the way he searches around the web. He mentioned we're entering a phase where we have at least 20 'AI' tools to boost our productivity - and the tough thing now is choosing which ones to use.&lt;/p&gt;
&lt;p&gt;With so much buzz around AI - it's easy to feel overwhelmed and think "I don't use that, maybe I should?". Now - I don't know which tools are the best for your particular use case. But for the past 6-10 months, I've settled on a nice group of tools I think more people would benefit from.&lt;/p&gt;
&lt;h2 id="for-everything-else"&gt;For everything else&lt;/h2&gt;
&lt;p&gt;I recently upgraded my MacBook to an M3 Max with 64GB of RAM. One of the main reasons I got it was so that I could run a nice suite of Large Language Models (LLMs) locally. &lt;a href="https://ollama.com/"&gt;Ollama&lt;/a&gt; has been at the center of that workflow - (&lt;a href="https://www.reddit.com/r/ollama/comments/1idqxto/comment/ma19shz/?utm_source=share&amp;amp;utm_medium=web3x&amp;amp;utm_name=web3xcss&amp;amp;utm_term=1&amp;amp;utm_content=share_button"&gt;don't forget&lt;/a&gt;). Whenever a new &lt;a href="https://ollama.com/library/deepseek-r1"&gt;model&lt;/a&gt; comes out, it's just an &lt;code&gt;ollama pull&lt;/code&gt; away - and you're up and running.&lt;/p&gt;
&lt;p&gt;Models are great - but just like ChatGPT showed everyone, the interface really makes the difference. Since I'm a Mac user, I wanted a unified way to interact with all the models I'm interested in - in a nice interface. &lt;a href="https://boltai.com?aff=2OOJDR"&gt;BoltAI&lt;/a&gt; does just that.&lt;/p&gt;
&lt;p&gt;There are a lot of things I like about BoltAI. But my favorite one is that I can use a single interface for just about any model under the sun (and running on my laptop). I've also set up &lt;a href="https://openrouter.ai/"&gt;OpenRouter&lt;/a&gt;, so I can try out any tropical model/provider combo with just a few clicks. It's a one-time purchase - and &lt;a href="https://danielnguyen.me/"&gt;Daniel&lt;/a&gt; is &lt;a href="https://boltai.canny.io/"&gt;super responsive&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img alt="BoltAI LLM application" src="https://duarteocarmo.com/images/76/boltai.webp" /&gt;&lt;/p&gt;
&lt;p&gt;Two things are still missing from my everyday workflow, and I haven't solved them yet. The first is some sort of canvas interface; &lt;a href="https://support.anthropic.com/en/articles/9487310-what-are-artifacts-and-how-do-i-use-them"&gt;artifacts&lt;/a&gt; are my favorite for this. The second thing I'm missing is some sort of LLM + web search integration. I've been experimenting with &lt;a href="https://www.perplexity.ai/"&gt;Perplexity&lt;/a&gt;, &lt;a href="https://www.phind.com/"&gt;Phind&lt;/a&gt;, and some &lt;a href="https://docs.boltai.com/docs/chat-ui/ai-plugins"&gt;plugins&lt;/a&gt; for BoltAI - but I'm still skeptical of LLMs' interpretation of search results.&lt;/p&gt;
&lt;p&gt;Great, what about engineering?&lt;/p&gt;
&lt;h2 id="for-code"&gt;For code&lt;/h2&gt;
&lt;p&gt;My first purchase of any AI service at all was (and still is) GitHub's &lt;a href="https://github.com/features/copilot"&gt;copilot&lt;/a&gt;. I've tried everything under the sun - &lt;a href="https://codeium.com/"&gt;Codeium&lt;/a&gt;, &lt;a href="https://codeium.com/"&gt;SuperMaven&lt;/a&gt;, but none comes quite close to GitHub. I consider it a commodity now, just like my LSP's completion.&lt;/p&gt;
&lt;p&gt;I'm still largely a &lt;a href="https://github.com/duarteocarmo/dotfiles/tree/master/.config/nvim"&gt;NeoVim&lt;/a&gt; user. And one of the most useful LLM tools/plugins I've used with it is &lt;a href="https://github.com/Robitx/gp.nvim"&gt;Gp.nvim&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It's an amazing plugin. It allows me to generate new code, re-write portions of the code base, or even create a floating chat window to chat with a particular part of the codebase I want to understand better. Oh, and it also supports Ollama models - so that the magic doesn't stop - even when I'm on an airplane.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Gif of GpNvim in action" src="https://duarteocarmo.com/images/76/gpnvim.gif" /&gt;&lt;/p&gt;
&lt;p&gt;Even if I like NeoVim, I'll admit that for certain projects - it doesn't quite cut it. For larger codebases, using notebooks, or when I &lt;em&gt;feel like it&lt;/em&gt;, &lt;a href="https://www.cursor.com/"&gt;Cursor&lt;/a&gt; has also been a booster. Particularly when you're creating something new, rather than tweaking something that exists already. A small static page - some CSS styling on a template - it can handle most of that.&lt;/p&gt;
&lt;p&gt;A tool I've been exploring more lately is &lt;a href="https://aider.chat/"&gt;Aider&lt;/a&gt;. Think of it as Cursor, but terminal based - with a git integration. Aider gives me a bit more control over the changes I want it to make. It allows me to only select certain files where it operates, it can also run commands - and examine their output. It's a super nice tool - especially if you prefer a terminal + git workflow. As an added bonus, Aider also supports &lt;a href="https://aider.chat/docs/llms/ollama.html"&gt;local&lt;/a&gt; models. To be completely honest - I think I've barely even &lt;a href="https://aider.chat/docs/usage/commands.html"&gt;scratched&lt;/a&gt; what's possible with Aider.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Aider LLM tool in action" src="https://duarteocarmo.com/images/76/aider.gif" /&gt;&lt;/p&gt;
&lt;h2 id="final-thoughts"&gt;Final thoughts&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;On tools.&lt;/strong&gt; I agree with Pedro - there's no shortage of tools out there. Everyone will tell you to use X, use Y, or charge you 20 USD for it. Like most things, tools are only as good as how productive you are with them - and that's very person-specific. The best tool for me is probably not the best tool for you.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Exploration is key.&lt;/strong&gt; I believe one of the most important things is to keep exploring what's out there. Don't dismiss something just because it's new. A great new &lt;a href="https://www.deepseek.com/"&gt;model&lt;/a&gt;? Take it for a spin. A transformative &lt;a href="https://www.all-hands.dev/"&gt;IDE&lt;/a&gt;? Give it a shot and see what it's capable of. You might just find something you like - and &lt;em&gt;a lot&lt;/em&gt; of things you don't.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Local first.&lt;/strong&gt; If there's something we learned in the last few months, it's that we probably don't need as much compute as we thought we did. Local models are getting really good, really fast. You might have noticed that almost all the tools shared here also support local models that I'm running on my own machine. Are they as good as the big ones? No. Well, at least &lt;em&gt;not yet&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Keep exploring.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>What the hell Is GPQA, anyway?</title><link href="https://duarteocarmo.com/blog/what-the-hell-is-gqpa-anyway.html" rel="alternate"/><published>2025-01-15T00:00:00+01:00</published><updated>2025-01-15T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2025-01-15:/blog/what-the-hell-is-gqpa-anyway.html</id><summary type="html">&lt;p&gt;In the period of the generative AI boom, everything is happening all at once. Every day a new model comes out, every week a billion dollar company teases a great new advancement.&lt;/p&gt;
&lt;p&gt;More often than not, they'll show us &lt;a href="https://www.anthropic.com/news/claude-3-family"&gt;some&lt;/a&gt; &lt;a href="https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/"&gt;fancy&lt;/a&gt; &lt;a href="https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/"&gt;version&lt;/a&gt; of this:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/75/0c345aa068fce110d80e914f8cd8898d.png" alt="Benchmark example" style="max-width:100%;border-radius: 2px;"&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.theverge.com/2024/10/28/24281965/honestly-this-is-a-work-of-art"&gt;We all love a mysterious chart …&lt;/a&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;In the period of the generative AI boom, everything is happening all at once. Every day a new model comes out, every week a billion-dollar company teases a great new advancement.&lt;/p&gt;
&lt;p&gt;More often than not, they'll show us &lt;a href="https://www.anthropic.com/news/claude-3-family"&gt;some&lt;/a&gt; &lt;a href="https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/"&gt;fancy&lt;/a&gt; &lt;a href="https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/"&gt;version&lt;/a&gt; of this:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/75/0c345aa068fce110d80e914f8cd8898d.png" alt="Benchmark example" style="max-width:100%;border-radius: 2px;"&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.theverge.com/2024/10/28/24281965/honestly-this-is-a-work-of-art"&gt;We all love a mysterious chart.&lt;/a&gt; &lt;sup id="sf-what-the-hell-is-gqpa-anyway-1-back"&gt;&lt;a href="#sf-what-the-hell-is-gqpa-anyway-1" class="simple-footnote" title="Source"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;A healthy person might look at this and just think - great - the higher the score the better the model. Unfortunately - I'm &lt;a href="https://duarteocarmo.com/blog/the-marathon"&gt;not healthy&lt;/a&gt;, and I'm a bit sick of how we just throw these tables around like they're the absolute truth.&lt;/p&gt;
&lt;p&gt;A &lt;strong&gt;40.9 score in GPQA&lt;/strong&gt; - what the &lt;em&gt;hell&lt;/em&gt; does that even mean?&lt;/p&gt;
&lt;h2 id="table-of-contents-and-code"&gt;Table of contents and code&lt;/h2&gt;
&lt;div class="toc"&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#table-of-contents-and-code"&gt;Table of contents and code&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#a-quick-intro-to-gpqa"&gt;A quick intro to GPQA&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#trying-to-reproduce-gpqa"&gt;Trying to reproduce GPQA&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#testing-eleutherais-lm-evaluation-harness"&gt;Testing EleutherAI's LM Evaluation Harness&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#running-gpqa-zero-shot-with-dspy"&gt;Running GPQA zero-shot with DsPy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#a-small-dspy-side-track"&gt;A small DSPy side-track&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#final-thoughts"&gt;Final thoughts&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;&lt;a href="https://gist.github.com/duarteocarmo/3eeb2200af74880e034d45e63e30aafe"&gt;Code and prompts&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="a-quick-intro-to-gpqa"&gt;A quick intro to GPQA&lt;/h2&gt;
&lt;p&gt;GPQA stands for "Graduate-Level Google-Proof Q&amp;amp;A Benchmark." It is a &lt;a href="https://huggingface.co/datasets/Idavidrein/gpqa"&gt;dataset&lt;/a&gt; and &lt;a href="https://github.com/idavidrein/gpqa"&gt;methodology&lt;/a&gt; introduced in a &lt;a href="https://arxiv.org/pdf/2311.12022"&gt;paper&lt;/a&gt; by researchers from NYU, Cohere, and Anthropic. In essence, GPQA focuses on a subset of exceptionally challenging questions from the fields of physics, chemistry, and biology.&lt;/p&gt;
&lt;p&gt;The dataset is available in three versions, each with varying levels of difficulty:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Extended Set&lt;/strong&gt;: 546 questions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Main Set&lt;/strong&gt;: 448 questions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Diamond Set&lt;/strong&gt;: 198 questions (the most challenging subset).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here's an example of a question from the dataset:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Methylcyclopentadiene (which exists as a fluxional mixture of isomers) was allowed to react with methyl isoamyl ketone
and a catalytic amount of pyrrolidine. A bright yellow, cross-conjugated polyalkenyl hydrocarbon product formed (as
a mixture of isomers), with water as a side product. These products are derivatives of fulvene. This product was then
allowed to react with ethyl acrylate in a 1:1 ratio. Upon completion of the reaction, the bright yellow color had disappeared.
How many chemically distinct isomers make up the final product (not counting stereoisomers)?
A) 2
B) 16
C) 8
D) 4
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;For context, a 'non-expert' human - with unlimited time and internet access - achieves roughly 30.4% accuracy on the main set.&lt;/p&gt;
&lt;p&gt;With that in mind, it's intriguing to see how large language models (LLMs) are tested against these tasks. Fortunately, the prompts used for evaluation are openly shared in both the &lt;a href="https://arxiv.org/pdf/2311.12022"&gt;paper&lt;/a&gt; and the &lt;a href="https://github.com/idavidrein/gpqa"&gt;repository&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Zero-shot prompt:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;What is the correct answer to this question: What is the solubility of the stochiometric tricalcium
phosphate in a solution with the pH level of a) pH = 9, b) pH = 7 and c) pH = 5 (25 °C)?
The Ksp for Ca3(PO4)2 is 2𝑥10- 29, the dissociation constants of phosphoric acid are Ka1 =
7.5𝑥10- 3, Ka2 = 6.2𝑥10- 8 and Ka3 = 1.8𝑥10- 12.
Choices:
(A) 1.96𝑥10- 5; 1.35𝑥10- 4; 4.37𝑥10- 3
(B) 1.67𝑥10- 5; 1.51𝑥10- 4; 4.81𝑥10- 3
(C) 1.54𝑥10- 5𝑀; 1.42𝑥10- 4; 4.67𝑥10- 3
(D) 1.43𝑥10- 5; 1.29𝑥10- 4; 4.58𝑥10- 3

Format your response as follows: "The correct answer is (insert answer here)".
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Zero-shot chain-of-thought prompt:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;What is the correct answer to this question: What is the solubility of the stochiometric tricalcium
phosphate in a solution with the pH level of a) pH = 9, b) pH = 7 and c) pH = 5 (25 °C)?
The Ksp for Ca3(PO4)2 is 2𝑥10- 29, the dissociation constants of phosphoric acid are Ka1 =
7.5𝑥10- 3, Ka2 = 6.2𝑥10- 8 and Ka3 = 1.8𝑥10- 12.
Choices:
(A) 1.96𝑥10- 5; 1.35𝑥10- 4; 4.37𝑥10- 3
(B) 1.67𝑥10- 5; 1.51𝑥10- 4; 4.81𝑥10- 3
(C) 1.54𝑥10- 5𝑀; 1.42𝑥10- 4; 4.67𝑥10- 3
(D) 1.43𝑥10- 5; 1.29𝑥10- 4; 4.58𝑥10- 3
Let's think step by step:
...MODEL RESPONSE HERE...

Based on the above, what is the single, most likely answer choice? Answer in the format "The
correct answer is (insert answer here)".
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;A curious reader might notice that the 'zero-shot' chain-of-thought (COT) prompt actually involves two calls to the model: the first to generate a response and the second to finalize the answer. This approach is likely &lt;strong&gt;not&lt;/strong&gt; representative of how these models are typically used in real-world scenarios - but we'll explore that further in a moment.&lt;/p&gt;
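&lt;p&gt;To make the two-call flow concrete, here's a minimal sketch. It is not the paper's code: &lt;code&gt;two_call_cot&lt;/code&gt;, &lt;code&gt;fake_model&lt;/code&gt;, and the prompt constants are names I made up for illustration, and the stub stands in for whatever model API you would actually call.&lt;/p&gt;

```python
# Prompt pieces adapted from the chain-of-thought prompt shown above.
QUESTION_PROMPT = (
    "What is the correct answer to this question: {question}\n"
    "Choices:\n{choices}\n"
    "Let's think step by step:"
)
FINAL_PROMPT = (
    "\n\nBased on the above, what is the single, most likely answer choice? "
    'Answer in the format "The correct answer is (insert answer here)".'
)


def two_call_cot(question, choices, complete):
    """Run the two-step flow with any callable that maps a prompt to text."""
    formatted = "\n".join(
        f"({letter}) {choice}" for letter, choice in zip("ABCD", choices)
    )
    first_prompt = QUESTION_PROMPT.format(question=question, choices=formatted)
    reasoning = complete(first_prompt)  # call 1: free-form reasoning
    second_prompt = first_prompt + "\n" + reasoning + FINAL_PROMPT
    return complete(second_prompt)  # call 2: commit to a final answer


def fake_model(prompt):
    """Hypothetical stub: answers differently depending on which call it sees."""
    if prompt.endswith(FINAL_PROMPT):
        return "The correct answer is (A)"
    return "Ancona is the regional capital, so (A) looks right."


calls = []  # record every prompt so we can count the model calls


def counting_model(prompt):
    calls.append(prompt)
    return fake_model(prompt)


final = two_call_cot(
    "What's the capital of Le Marche?", ["Ancona", "Senigallia"], counting_model
)
print(final, "- model calls:", len(calls))
```

&lt;p&gt;The point is simply that the second prompt contains the model's own reasoning from the first call, so a single question costs two round trips.&lt;/p&gt;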
&lt;p&gt;This raises an interesting question: how can we reproduce these results ourselves?&lt;/p&gt;
&lt;h2 id="trying-to-reproduce-gpqa"&gt;Trying to reproduce GPQA&lt;/h2&gt;
&lt;p&gt;If you've explored LLM evaluations, you're likely familiar with the &lt;a href="https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?params=-1%2C8\&amp;amp;official=true\&amp;amp;rankingMode=dynamic\&amp;amp;columns=rank%2Cmodel.type_icon%2Cid%2Cmodel.average_score%2Cmetadata.params_billions%2Cmetadata.submission_date%2Cevaluations.gpqa.normalized_score%2Cmetadata.base_model%2Cevaluations.mmlu_pro.normalized_score%2Cevaluations.musr.normalized_score"&gt;Open LLM Leaderboard&lt;/a&gt; by Hugging Face. The leaderboard recently received an update, allowing users to filter by 'Only Official Providers' - handy if you want to focus on well-known providers.&lt;/p&gt;
&lt;p&gt;Here's a snapshot of the top-performing models under 8 billion parameters, sorted by GPQA score:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/75/86a23c7296cb79b43c2eff265f26cf56.png" alt="Open LLM leaderboard" style="max-width:100%;border-radius: 2px;"&gt;&lt;/p&gt;
&lt;p&gt;You'll notice there are some familiar names in there, like Phi or Qwen 2.5 7B. Of course, models from OpenAI, Google, or Anthropic are off this list, since it only covers open models - which those aren't.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://huggingface.co/docs/leaderboards/open_llm_leaderboard/about"&gt;Digging a bit deeper&lt;/a&gt;, we notice two interesting things:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;This leaderboard uses the zero-shot version of this dataset, and measures the normalized accuracy. It doesn't specify which variant of the dataset it uses - we'll assume the &lt;code&gt;main&lt;/code&gt; variant for now - but that could be wrong.&lt;/li&gt;
&lt;li&gt;Another interesting tidbit is how they run these benchmarks: '&lt;em&gt;...using the &lt;a href="https://github.com/EleutherAI/lm-evaluation-harness"&gt;Eleuther AI Language Model Evaluation Harness&lt;/a&gt;, a unified framework to test generative language models...&lt;/em&gt;'&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Great - so now we have at least &lt;em&gt;some&lt;/em&gt; clue of how to reproduce this.&lt;/p&gt;
&lt;h2 id="testing-eleutherais-lm-evaluation-harness"&gt;Testing EleutherAI's LM Evaluation Harness&lt;/h2&gt;
&lt;p&gt;There are &lt;a href="https://github.com/openai/simple-evals?tab=readme-ov-file"&gt;several frameworks&lt;/a&gt; for evaluating large language models (LLMs), including &lt;a href="https://github.com/openai/evals"&gt;OpenAI's evals&lt;/a&gt; and EleutherAI's &lt;a href="https://github.com/EleutherAI/lm-evaluation-harness"&gt;LM Evaluation Harness&lt;/a&gt;. EleutherAI's framework, while powerful, can feel overwhelming at first. It supports a wide range of &lt;a href="https://github.com/EleutherAI/lm-evaluation-harness?tab=readme-ov-file#model-apis-and-inference-servers"&gt;models and APIs&lt;/a&gt; and a significant number of &lt;a href="https://github.com/EleutherAI/lm-evaluation-harness/tree/main/lm_eval/tasks#tasks"&gt;tasks&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;After some poking around, I managed to find the &lt;a href="https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/tasks/gpqa/README.md#tasks"&gt;part of the repo&lt;/a&gt; that relates to the benchmark we're interested in.&lt;/p&gt;
&lt;p&gt;We can see that they support the following &lt;a href="https://github.com/EleutherAI/lm-evaluation-harness/tree/main/lm_eval/tasks/gpqa#tasks"&gt;tasks&lt;/a&gt;:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;* `gpqa_{main, diamond, extended}_zeroshot`
* `gpqa_{main, diamond, extended}_n_shot`
* `gpqa_{main, diamond, extended}_generative_n_shot`
* `gpqa_{main, diamond, extended}_cot_zeroshot`
* `gpqa_{main, diamond, extended}_cot_n_shot`
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We can also see the &lt;a href="https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/tasks/gpqa/zeroshot/_gpqa_zeroshot_yaml#L9"&gt;config and prompt&lt;/a&gt; they use for the &lt;code&gt;gpqa_main_zeroshot&lt;/code&gt; task which is the one we're interested in. The attentive reader might notice already that &lt;a href="https://github.com/EleutherAI/lm-evaluation-harness/blob/bb098f13b05e361f01a5afe7b612779ce362b3f2/lm_eval/tasks/gpqa/zeroshot/_gpqa_zeroshot_yaml#L9"&gt;the prompt&lt;/a&gt; that is being used here is quite different from the one from the original paper. But let's ignore that for now.&lt;/p&gt;
&lt;p&gt;Here's how I tried to run the evaluation:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;lm-eval --model openai-chat-completions \
        --model_args model=gpt-4o-mini,num_concurrent=10,max_retries=3 \
        --tasks gpqa_main_zeroshot \
        --apply_chat_template \
        --output_path results \
        --log_samples \
        --wandb_args project=lmeval \
        --use_cache model_cache
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This errors out:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;NotImplementedError: Loglikelihood (and therefore `multiple_choice`-type tasks) is not supported for chat completions as OpenAI does not provide prompt logprobs. See https://github.com/EleutherAI/lm-evaluation-harness/issues/942#issuecomment-1777836312 or https://github.com/EleutherAI/lm-evaluation-harness/issues/1196 for more background on this limitation.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Oh wait? What's happening here? Well - most of these multiple-choice benchmarks use loglikelihood to determine which choice has the highest probability. This is a bit weird - since it's not the methodology outlined in the original GPQA paper. However, it seems to be a deliberate choice to keep the framework consistent.&lt;sup id="sf-what-the-hell-is-gqpa-anyway-2-back"&gt;&lt;a href="#sf-what-the-hell-is-gqpa-anyway-2" class="simple-footnote" title="See this and this."&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;However, there's another benchmark that we could run and that appears to work - the zero-shot chain-of-thought one.&lt;/p&gt;
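&lt;p&gt;For intuition, loglikelihood scoring can be sketched roughly like this. The per-token log-probabilities below are made up, &lt;code&gt;pick_by_loglikelihood&lt;/code&gt; is a hypothetical helper rather than the harness's API, and dividing by the character length of the choice is just one common way to normalize (I'm assuming, not claiming, that it resembles what the leaderboard means by normalized accuracy).&lt;/p&gt;

```python
def pick_by_loglikelihood(choice_logprobs, normalize=False):
    """Return (best_choice, scores) given token log-probs for each choice.

    choice_logprobs maps the text of a choice to the log-probabilities the
    model assigned to each of its tokens when appended to the question.
    """
    scores = {}
    for choice, token_logprobs in choice_logprobs.items():
        total = sum(token_logprobs)  # log P(choice | question)
        # Optional length normalization so longer choices are not penalized.
        scores[choice] = total / len(choice) if normalize else total
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical numbers for the four choices of the example question above.
example = {
    "2": [-1.9],
    "16": [-2.3, -0.4],
    "8": [-0.7],
    "4": [-1.2],
}

best, scores = pick_by_loglikelihood(example)
best_norm, scores_norm = pick_by_loglikelihood(example, normalize=True)
print(best, best_norm)
```

&lt;p&gt;Note that nothing is ever generated here: the model only scores text it is given, which is why an API that doesn't return prompt logprobs can't run this kind of task.&lt;/p&gt;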
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;# Same command as before, but with the chain-of-thought (cot) task
lm-eval --model openai-chat-completions \
        --model_args model=gpt-4o-mini,num_concurrent=10,max_retries=3 \
        --tasks gpqa_main_cot_zeroshot \
        --apply_chat_template \
        --output_path results \
        --log_samples \
        --wandb_args project=lmeval \
        --use_cache model_cache
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This gives us:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;|        Tasks         |Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|----------------------|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gpqa_main_cot_zeroshot|      1|flexible-extract|     0|exact_match|↑  |0.1205|±  |0.0154|
|                      |       |strict-match    |     0|exact_match|↑  |0.0000|±  |0.0000|
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
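&lt;p&gt;As a quick sanity check on that table, here's a small sketch that recovers the number of correct answers and the reported standard error. I'm assuming the 448 questions of the main set and that the stderr is the usual standard error of a mean of 0/1 scores; the harness may compute it slightly differently.&lt;/p&gt;

```python
import math

n_questions = 448  # size of the main set, as listed earlier
accuracy = 0.1205  # flexible-extract exact_match from the table

# Number of questions answered correctly under flexible-extract.
n_correct = round(accuracy * n_questions)
p = n_correct / n_questions

# Sample standard error of a mean of 0/1 outcomes (assumed formula).
stderr = math.sqrt(p * (1 - p) / (n_questions - 1))

# Random guessing among four choices, for comparison.
chance = 1 / 4

print(n_correct, round(stderr, 4), chance)
```

&lt;p&gt;With four choices per question, random guessing would land around 25%, so a score of 12% says more about answer extraction than about the model.&lt;/p&gt;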

&lt;p&gt;Now, I haven't dug too deeply into these results for the COT version, but they seem very low, possibly because they rely on the &lt;code&gt;flexible-extract&lt;/code&gt; filter, which &lt;em&gt;could&lt;/em&gt; perform poorly if it's regex-based. One thing we notice is that the &lt;a href="https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/tasks/gpqa/cot_zeroshot/_gpqa_cot_zeroshot_yaml#L9"&gt;prompt&lt;/a&gt; used here is closer to the original one from the paper. However, it lacks the two-model-call approach outlined there.&lt;/p&gt;
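&lt;p&gt;To illustrate why regex-based extraction can be brittle, here's a small sketch. The two patterns are my own stand-ins for a strict and a flexible filter, not the harness's actual regexes, and the responses are invented.&lt;/p&gt;

```python
import re

# Hypothetical patterns: strict wants the exact phrasing, flexible accepts
# any parenthesized letter anywhere in the response.
STRICT = re.compile(r"The correct answer is \(([A-D])\)")
FLEXIBLE = re.compile(r"\(([A-D])\)")


def extract(text, pattern):
    """Return the last letter matched by the pattern, or None if no match."""
    matches = pattern.findall(text)
    return matches[-1] if matches else None


responses = [
    "The correct answer is (B)",
    "After weighing the options, I would go with (C).",
    "The answer is B.",  # no parentheses: neither pattern matches
]

strict = [extract(r, STRICT) for r in responses]
flexible = [extract(r, FLEXIBLE) for r in responses]
print(strict, flexible)
```

&lt;p&gt;A response can be perfectly right and still score zero if it doesn't match the expected format, which would be consistent with the 0.0000 on &lt;code&gt;strict-match&lt;/code&gt; above.&lt;/p&gt;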
&lt;p&gt;At this point, I was starting to lose faith in this entire evaluation system. What's the most straightforward way to achieve an interesting score on this benchmark? Something that can work for any model, whether open or closed?&lt;/p&gt;
&lt;h2 id="running-gpqa-zero-shot-with-dspy"&gt;Running GPQA zero-shot with DsPy&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://dspy.ai/"&gt;DSPy&lt;/a&gt; is a great framework out of Standford that provides some structure around prompting for LLMs. It also provides some great optimizers so we can programatically optimize a prompt without having to write much of it.&lt;/p&gt;
&lt;p&gt;We start by getting the data in:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;polars&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;pl&lt;/span&gt;

&lt;span class="c1"&gt;# Load the dataset from Hugging Face and prepare it&lt;/span&gt;

&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;pl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scan_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"hf://datasets/Idavidrein/gpqa/*_main.csv"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;collect&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Question"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The next step is to configure our LLM and the Signatures (think data-model) we'll use:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dspy&lt;/span&gt;

&lt;span class="n"&gt;lm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"openai/gpt-4o-mini"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.00&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;configure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;lm&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Answer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Signature&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;"""Respond to the question with the correct answer"""&lt;/span&gt;
    &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InputField&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InputField&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OutputField&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;generate_answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Answer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Here's an example of how to use it:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;generate_answer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"What's the capital of Le Marche?"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Ancona"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"Senigallia"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# prints Prediction(answer='Ancona')&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Let's juggle our data into a dataset:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt;

&lt;span class="n"&gt;dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_dicts&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;choices&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Correct Answer"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Incorrect Answer 1"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Incorrect Answer 2"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Incorrect Answer 3"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shuffle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;example&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Example&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Question"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;correct_answer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Correct Answer"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;with_inputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"question"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"choices"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EvaluateZeroShot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Signature&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;"""Evaluate the answer to the question"""&lt;/span&gt;
    &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InputField&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InputField&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InputField&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;correct_answer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InputField&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;is_correct&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OutputField&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;evaluate_answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EvaluateZeroShot&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;validate_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prediction&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;eval&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;evaluate_answer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prediction&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;correct_answer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;correct_answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;eval&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;is_correct&lt;/span&gt;

&lt;span class="n"&gt;evaluator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;devset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;trainset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_threads&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;display_progress&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;display_table&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;accuracy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;evaluator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;generate_answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metric&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;validate_response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Accuracy: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;accuracy&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# prints 33.48&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Great, so a single run on GPQA gives us 33.48% accuracy. That's still some way short of the 40.2% &lt;a href="https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/"&gt;advertised by OpenAI&lt;/a&gt;, but again, there's apparently no way for us to verify that number.&lt;/p&gt;
&lt;h2 id="a-small-dspy-side-track"&gt;A small DSPy side-track&lt;/h2&gt;
&lt;p&gt;There are a couple of interesting experiments we can run with DSPy.&lt;/p&gt;
&lt;p&gt;The first is to try a chain-of-thought (CoT) prompt to see how it influences our score:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;cot&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ChainOfThought&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"question, choices -&amp;gt; answer"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;accuracy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;evaluator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cot&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metric&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;validate_response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Accuracy: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;accuracy&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# prints 35.71&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;That's a ~7% relative increase on our GPQA score (about 2.2 percentage points, from 33.48% to 35.71%). Probably just noise.&lt;/p&gt;
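&lt;p&gt;To get a rough feel for why a gap this small could be noise, here is a minimal sketch that compares it against the binomial standard error of the baseline score. It assumes the evaluation set has 448 questions (my assumption for the size of the GPQA main set, not something printed above), treats every question as an independent trial, and ignores any noise added by the LLM judge.&lt;/p&gt;

```python
import math

N_QUESTIONS = 448  # ASSUMPTION: number of questions in the evaluated set
baseline = 33.48   # zero-shot accuracy in percent, from the run above
cot = 35.71        # chain-of-thought accuracy in percent, from the run above


def standard_error(accuracy_pct, n):
    """Binomial standard error of an accuracy, in percentage points."""
    p = accuracy_pct / 100
    return 100 * math.sqrt(p * (1 - p) / n)


se = standard_error(baseline, N_QUESTIONS)
gap = cot - baseline
print(f"standard error: {se:.2f} points, observed gap: {gap:.2f} points")
```

&lt;p&gt;Under those assumptions the observed gap is roughly one standard error wide, which is well within what you would expect from run-to-run variation alone.&lt;/p&gt;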
&lt;p&gt;Another interesting thing we can do is optimize our prompt for this specific GPQA benchmark. Importantly, we don't even need to fine-tune a model: we're only going to optimize the prompt.&lt;/p&gt;
&lt;p&gt;To do that, we'll use 98 questions that are in the extended version of the dataset, but not in the main version of it. Our goal is to optimize on those 98 and then see if an optimized prompt performs better on our main version.&lt;/p&gt;
&lt;p&gt;Here we go:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;df_extended&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;pl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scan_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"hf://datasets/Idavidrein/gpqa/*_extended.csv"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;collect&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Question"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df_extended&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df_extended&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;~&lt;/span&gt;&lt;span class="n"&gt;pl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Question"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;is_in&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_main&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Question"&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt; &lt;span class="c1"&gt;# returns 98 questions&lt;/span&gt;

&lt;span class="c1"&gt;# build extended_set from df_extended (omitted for brevity)&lt;/span&gt;

&lt;span class="n"&gt;cot&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ChainOfThought&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"question, choices -&amp;gt; answer"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# base format&lt;/span&gt;
&lt;span class="n"&gt;tp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dspy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MIPROv2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metric&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;validate_response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"light"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_threads&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# optimization&lt;/span&gt;
&lt;span class="n"&gt;optimized_cot&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cot&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;trainset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;extended_set&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# run optimization&lt;/span&gt;

&lt;span class="c1"&gt;# evaluate optimized prompt&lt;/span&gt;

&lt;span class="n"&gt;accuracy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;evaluator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;optimized_cot&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metric&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;validate_response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Accuracy: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;accuracy&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# prints 38.39&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This produces an &lt;a href="https://gist.github.com/duarteocarmo/3eeb2200af74880e034d45e63e30aafe#file-optimized_cot-json"&gt;optimized prompt&lt;/a&gt; with a few example questions. When evaluating, we get a further relative increase of about 7.5% (from 35.71% to 38.39%). As you can see, the results of these benchmarks are HIGHLY prompt dependent. It even seems possible to optimize a much smaller model to perform well on this 'benchmark'.&lt;/p&gt;
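&lt;p&gt;The percentages quoted for these gains are relative to the previous score, not absolute percentage points. A small sketch, using only the three accuracies printed above (the dictionary labels are mine, for illustration), makes the difference explicit:&lt;/p&gt;

```python
# Accuracies (in percent) printed by the three runs above.
# The labels are illustrative names, not identifiers from the original code.
scores = {"zero-shot": 33.48, "chain-of-thought": 35.71, "optimized": 38.39}


def relative_gain(old, new):
    """Relative improvement of new over old, in percent."""
    return 100 * (new - old) / old


names = list(scores)
gains = {}
for prev, curr in zip(names, names[1:]):
    absolute = scores[curr] - scores[prev]
    gains[curr] = relative_gain(scores[prev], scores[curr])
    print(f"{prev} to {curr}: +{absolute:.2f} points ({gains[curr]:.1f}% relative)")
```

&lt;p&gt;Both ways of reporting are valid, but mixing them makes comparisons with published numbers harder, so it helps to state which one is meant.&lt;/p&gt;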
&lt;h2 id="final-thoughts"&gt;Final thoughts&lt;/h2&gt;
&lt;p&gt;Alright, after this deep dive, you should have a clearer idea of what GPQA is and how it works. Most benchmarks you see out there are variations of these multi-choice questions: some harder, some easier, some slightly different.&lt;/p&gt;
&lt;p&gt;But this exercise also revealed a few important things about the benchmark ecosystem and its flaws:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Benchmarks are not scientific.&lt;/strong&gt;
Many benchmarks feel more like pseudo-science than systematic evaluations. Frameworks often fail to replicate the original paper's prompts and methodology. If this inconsistency exists here, imagine how it plays out in other benchmarks. Using log-likelihood to evaluate multi-choice questions might make sense in some cases, but the GPQA paper doesn't mention it at all. So, what's really going on here?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Prompts matter, a lot.&lt;/strong&gt;
As we saw with DSPy, small tweaks in prompts can make a big difference, sometimes boosting scores by 5-10%. Add a "you are a physics expert" to your prompt, and you might get even better results. This raises the question: how sensitive are these benchmarks to prompts? Are some models more affected than others? Definitely worth exploring.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The issue of trust.&lt;/strong&gt;
Benchmarks should inspire confidence in a model's performance, but instead, they often raise more questions than answers. For instance, what &lt;a href="https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/"&gt;does a 40.2 GPQA score for GPT-4o-mini really mean&lt;/a&gt;? What dataset version? What prompt? Without these details, it's hard to trust the results, especially when they're not reproducible.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;How can we evaluate?&lt;/strong&gt;
I once heard in a podcast: "There's nothing like a good vibe evaluation." Sure, when building products with LLMs, good &lt;a href="https://hamel.dev/blog/posts/evals/"&gt;evaluations&lt;/a&gt; matter. But when comparing models, questions of &lt;a href="https://lmarena.ai/"&gt;preference&lt;/a&gt; are often more valuable than multi-choice benchmarks. Efforts like the &lt;a href="https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard"&gt;OpenLLM leaderboard&lt;/a&gt; are interesting, but do they really tell us much about these models? With closed models, at least, the mystery adds intrigue.&lt;/p&gt;
&lt;p&gt;Benchmarks are fun, but in the end, the true test of a model is how well it performs in the real world. That's what matters.&lt;/p&gt;&lt;ol class="simple-footnotes"&gt;&lt;li id="sf-what-the-hell-is-gqpa-anyway-1"&gt;&lt;a href="https://arxiv.org/pdf/2412.08905"&gt;Source&lt;/a&gt; &lt;a href="#sf-what-the-hell-is-gqpa-anyway-1-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;li id="sf-what-the-hell-is-gqpa-anyway-2"&gt;See &lt;a href="https://github.com/EleutherAI/lm-evaluation-harness/issues/942#issuecomment-1777836312"&gt;this&lt;/a&gt; and &lt;a href="https://github.com/EleutherAI/lm-evaluation-harness/issues/1196"&gt;this&lt;/a&gt;. &lt;a href="#sf-what-the-hell-is-gqpa-anyway-2-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;</content><category term="blog"/></entry><entry><title>RAG tricks from the trenches</title><link href="https://duarteocarmo.com/blog/rag-tricks-from-the-trenches.html" rel="alternate"/><published>2024-12-30T00:00:00+01:00</published><updated>2024-12-30T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2024-12-30:/blog/rag-tricks-from-the-trenches.html</id><summary type="html">&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/74/main.jpg" alt="Men fishing" style="max-width:100%;border-radius: 2px;"&gt;&lt;/p&gt;
&lt;div class="toc"&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#some-context"&gt;Some context&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#our-data-a-controversial-dataset"&gt;Our data: A controversial dataset&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#a-stupid-simple-vector-store"&gt;A stupid simple vector store&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#plain-rag"&gt;Plain RAG&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#complicating-things-hybrid-search"&gt;Complicating things: Hybrid Search&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#what-if-we-ask-for-a-summary"&gt;What if we ask for a summary?&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href="#detecting-representative-questions"&gt;Detecting representative questions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#answering-questions-that-require-the-entire-dataset-in-context"&gt;Answering questions that require the entire dataset in context&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#expanding-user-queries"&gt;Expanding user queries&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#the-most-important-lesson"&gt;The most important Lesson&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#resources-to-go-beyond-the-basics"&gt;Resources to go beyond the basics …&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;</summary><content type="html">&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/74/main.jpg" alt="Men fishing" style="max-width:100%;border-radius: 2px;"&gt;&lt;/p&gt;
&lt;div class="toc"&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#some-context"&gt;Some context&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#our-data-a-controversial-dataset"&gt;Our data: A controversial dataset&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#a-stupid-simple-vector-store"&gt;A stupid simple vector store&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#plain-rag"&gt;Plain RAG&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#complicating-things-hybrid-search"&gt;Complicating things: Hybrid Search&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#what-if-we-ask-for-a-summary"&gt;What if we ask for a summary?&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href="#detecting-representative-questions"&gt;Detecting representative questions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#answering-questions-that-require-the-entire-dataset-in-context"&gt;Answering questions that require the entire dataset in context&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#expanding-user-queries"&gt;Expanding user queries&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#the-most-important-lesson"&gt;The most important Lesson&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#resources-to-go-beyond-the-basics"&gt;Resources to go beyond the basics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;&lt;a href="https://gist.github.com/duarteocarmo/79346dacdfcccbf0ed9184ddbaa49a32"&gt;Notebook download&lt;/a&gt; - &lt;a href="http://nbsanity.com/static/1637e83c083a69e81cfa19006502fced/summarization.html"&gt;nbsanity preview&lt;/a&gt;&lt;/p&gt;
&lt;!-- https://www.artic.edu/artworks/16837/tynemouth-priory-england --&gt;

&lt;h2 id="some-context"&gt;Some context&lt;/h2&gt;
&lt;p&gt;We had a database of 50M strings and I couldn't wait to embed them all. Embeddings for recommender systems were, &lt;em&gt;for a long time&lt;/em&gt;, the HOLY grail I longed for. I still remember, long before the RAG rage, having conversations with &lt;a href="https://www.parraguezr.net/"&gt;Pedro&lt;/a&gt; about the amazing types of applications we could build with embeddings.&lt;/p&gt;
&lt;p&gt;In one way, RAG has come a &lt;strong&gt;long&lt;/strong&gt; way. In another, RAG is still &lt;em&gt;the same&lt;/em&gt;. It's just search! Over embeddings.&lt;/p&gt;
&lt;p&gt;Over the past two or three years, the number of applications doing some sort of RAG has increased significantly.&lt;/p&gt;
&lt;p&gt;Here are some tricks I've learned along the way, in the hope that someone else can benefit from them as well.&lt;/p&gt;
&lt;p&gt;Important notice: Almost &lt;em&gt;none&lt;/em&gt; of these are mine. And that's why I like them.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# !pip install polars jupyter_black fasttext huggingface_hub lancedb litellm sklearn sentence_transformers&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;polars&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;pl&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;jupyter_black&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;fasttext&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;huggingface_hub&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hf_hub_download&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;lancedb&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;lancedb.pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LanceModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Vector&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;lancedb.embeddings&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_registry&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;litellm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;completion&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;sklearn.cluster&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;KMeans&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;json&lt;/span&gt;

&lt;span class="n"&gt;pl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_random_seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;jupyter_black&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# CONFIGURATION&lt;/span&gt;

&lt;span class="n"&gt;EMBEDDING_MODEL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;all-MiniLM-L6-v2&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;LLM&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;ollama_chat/llama3.1&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;TABLE_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;skeets&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;DEVICE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;mps&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="our-data-a-controversial-dataset"&gt;Our data: A controversial dataset&lt;/h2&gt;
&lt;p&gt;Remember that &lt;a href="https://www.404media.co/someone-made-a-dataset-of-one-million-bluesky-posts-for-machine-learning-research/"&gt;guy who got pretty much banned from Bluesky&lt;/a&gt; for collecting a dataset of 1 million skeets? Well, our dataset is a nice collection of not 1, but 2 million: &lt;a href="https://huggingface.co/datasets/alpindale/two-million-bluesky-posts"&gt;2 Million Bluesky posts&lt;/a&gt;. Isn't that controversial?&lt;/p&gt;
&lt;p&gt;I'm really enjoying Bluesky by the way. You should &lt;a href="https://bsky.app/profile/duarteocarmo.com"&gt;follow me&lt;/a&gt; there in case you come across this!&lt;/p&gt;
&lt;p&gt;Alright, let's load the dataset using Polars:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;pl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scan_ndjson&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;hf://datasets/alpindale/two-million-bluesky-posts/*.jsonl&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;reply_to&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;is_null&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;collect&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;with_columns&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;created_at&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Datetime&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;with_columns&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;created_at&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strftime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;%Y-%m&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alias&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;month_year&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;with_columns&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;created_at&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strftime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;%Y&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alias&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;year&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50_000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# so that we run fast.&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="si"&gt;=}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Example skeets:&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_dicts&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;df.shape=(50000, 8)
Example skeets:
[{&amp;#39;text&amp;#39;: &amp;quot;The last time I was back in the U.S. to visit family for Thanksgiving was in 2008, shortly after Obama was elected. People were elated and optimistic. The future looked bright.\nToday it could hardly be darker. That&amp;#39;s what the MAGAs have brought us: chaotic dictatorship.\nI&amp;#39;ll stay home in Canada.&amp;quot;,
  &amp;#39;created_at&amp;#39;: datetime.datetime(2024, 11, 27, 15, 30, 22, 133000),
  &amp;#39;author&amp;#39;: &amp;#39;did:plc:cnxppns7qxo2wzdlutmxbkih&amp;#39;,
  &amp;#39;uri&amp;#39;: &amp;#39;at://did:plc:cnxppns7qxo2wzdlutmxbkih/app.bsky.feed.post/3lbwumhstsc2e&amp;#39;,
  &amp;#39;has_images&amp;#39;: False,
  &amp;#39;reply_to&amp;#39;: None,
  &amp;#39;month_year&amp;#39;: &amp;#39;2024-11&amp;#39;,
  &amp;#39;year&amp;#39;: &amp;#39;2024&amp;#39;},
 {&amp;#39;text&amp;#39;: &amp;#39;9 Top CSS Essential Skills That Every Web designer Should Learn – http://bit.ly/vVyTg #CSS #webdesign&amp;#39;,
  &amp;#39;created_at&amp;#39;: datetime.datetime(2009, 10, 17, 17, 32, 37),
  &amp;#39;author&amp;#39;: &amp;#39;did:plc:id2rbt2kysdnscdrzh5o5n7d&amp;#39;,
  &amp;#39;uri&amp;#39;: &amp;#39;at://did:plc:id2rbt2kysdnscdrzh5o5n7d/app.bsky.feed.post/3lbwmyaeat42n&amp;#39;,
  &amp;#39;has_images&amp;#39;: False,
  &amp;#39;reply_to&amp;#39;: None,
  &amp;#39;month_year&amp;#39;: &amp;#39;2009-10&amp;#39;,
  &amp;#39;year&amp;#39;: &amp;#39;2009&amp;#39;}]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Unsurprisingly, most of our posts (88%) are from November '24.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;min_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;created_at&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;min&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;max_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;created_at&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;total_months&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_date&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;min_date&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_date&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;month&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;min_date&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;min_date: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;min_date&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, max_date: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;max_date&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;total_months: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;total_months&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;month_year&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value_counts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normalize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;month_year&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;descending&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;min_date: 2007-10-19 21:50:33, max_date: 2024-11-28 05:08:13.636000
total_months: 205
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;month_year&lt;/th&gt;
&lt;th&gt;proportion&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2024-11&lt;/td&gt;
&lt;td&gt;0.87644&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2024-10&lt;/td&gt;
&lt;td&gt;0.00028&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2024-09&lt;/td&gt;
&lt;td&gt;0.00028&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2024-08&lt;/td&gt;
&lt;td&gt;0.00038&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2024-07&lt;/td&gt;
&lt;td&gt;0.00072&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2024-06&lt;/td&gt;
&lt;td&gt;0.00032&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2024-05&lt;/td&gt;
&lt;td&gt;0.00024&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2024-04&lt;/td&gt;
&lt;td&gt;0.00036&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2024-03&lt;/td&gt;
&lt;td&gt;0.00032&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2024-02&lt;/td&gt;
&lt;td&gt;0.00054&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The dataset claims to have a configuration with the language of each skeet, but I couldn't find it.&lt;/p&gt;
&lt;p&gt;So, to keep things simple, let's use the &lt;a href="https://github.com/cisnlp/GlotLID"&gt;same model&lt;/a&gt; they did (&lt;code&gt;glotlid&lt;/code&gt;) and keep only the English skeets.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;model_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hf_hub_download&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;repo_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;cis-lmu/glotlid&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;model.bin&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fasttext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;text_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;text&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_list&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;text_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot; &amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;text_list&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;confidences&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;is_english&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;__label__eng_Latn&amp;quot;&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;with_columns&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Series&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;is_english&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UInt8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alias&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;is_english&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;total rows: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;is_english&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;total rows after filtering: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;total rows: 50000
total rows after filtering: 22528
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
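&lt;p&gt;The filter above only looks at the top label and ignores the confidences that come back with it. If you wanted to be stricter, a minimal sketch could also require a minimum confidence. The &lt;code&gt;english_mask&lt;/code&gt; helper, the 0.5 threshold, and the sample predictions below are made up for illustration; they only mimic the shape of the &lt;code&gt;predictions&lt;/code&gt; and &lt;code&gt;confidences&lt;/code&gt; used above.&lt;/p&gt;

```python
# Hypothetical helper: keep a skeet only if its top label is English
# AND the model's confidence clears a threshold (0.5 is an arbitrary choice).
def english_mask(predictions, confidences, min_confidence=0.5):
    return [
        labels[0] == "__label__eng_Latn" and float(scores[0]) >= min_confidence
        for labels, scores in zip(predictions, confidences)
    ]


# Made-up sample data, shaped like the (labels, scores) pairs used above.
predictions = [("__label__eng_Latn",), ("__label__por_Latn",), ("__label__eng_Latn",)]
confidences = [[0.98], [0.91], [0.31]]

mask = english_mask(predictions, confidences)
print(mask)
```

&lt;p&gt;A stricter threshold trades recall for precision, so it is one more knob you would want to evaluate rather than guess.&lt;/p&gt;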

&lt;h2 id="a-stupid-simple-vector-store"&gt;A stupid simple vector store&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://lancedb.github.io/lancedb/"&gt;Lancedb&lt;/a&gt; provides a great interface that combines &lt;a href="https://sbert.net/"&gt;sentence-transformers&lt;/a&gt; + pydantic.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lancedb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;/tmp/db&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;get_registry&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;sentence-transformers&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EMBEDDING_MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DEVICE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Skeet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;LanceModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SourceField&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ndims&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VectorField&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;TABLE_NAME&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;table_names&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drop_table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TABLE_NAME&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Table was deleted and will be overwritten.&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create_table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TABLE_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Skeet&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# this will automatically add and embed the items.&lt;/span&gt;

&lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_dicts&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create_fts_index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;text&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="plain-rag"&gt;Plain RAG&lt;/h2&gt;
&lt;p&gt;In its purest form, Retrieval Augmented Generation has three simple components:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Retrieve context (e.g., search)&lt;/li&gt;
&lt;li&gt;Build prompt&lt;/li&gt;
&lt;li&gt;Answer question&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Skeet&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_pydantic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Skeet&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Skeet&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;context_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;context_str&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;- &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;======&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;

    &lt;span class="n"&gt;PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="s2"&gt;* You are an expert at answering questions from the user given a context.&lt;/span&gt;
&lt;span class="s2"&gt;* Use the context section to answer the questions from the user.&lt;/span&gt;
&lt;span class="s2"&gt;* Answer the question directly. No BS.&lt;/span&gt;

&lt;span class="s2"&gt;&amp;lt;context&amp;gt;&lt;/span&gt;
&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;context_str&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
&lt;span class="s2"&gt;&amp;lt;/context&amp;gt;&lt;/span&gt;

&lt;span class="s2"&gt;&amp;lt;question&amp;gt;&lt;/span&gt;
&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
&lt;span class="s2"&gt;&amp;lt;/question&amp;gt;&lt;/span&gt;
&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;PROMPT&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;LLM&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;content&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;role&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;user&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;choices&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;message&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;content&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;question&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;What are the main reasons people are switching from twitter/X to bluesky?&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;build_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Question: &amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Answer: &lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;#39;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# print(f&amp;quot;Context chunks:\n{context}&amp;quot;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Question:  What are the main reasons people are switching from twitter/X to bluesky?
Answer:
&amp;#39;Based on the context, the main reasons people are switching from Twitter/X to Bluesky seem to be:

1. A desire for a more focused and curated experience, as mentioned in &amp;quot;Trying to be way more active here than Twitter...&amp;quot; and &amp;quot;My favorite thing about Twitter was looking at &amp;#39;What&amp;#39;s Trending&amp;#39;... Does that exist in Bluesky?&amp;quot;
2. Frustration with the chaos and noise on Twitter/X, as implied by &amp;quot;I feel like I’ve been neglecting Bluesky because I been on threads and it’s no different from IG but with Twitter capabilities .. 🤦🏾‍♂️&amp;quot; and &amp;quot;Everything all of the time is not what I ever asked for.&amp;quot;
3. A sense of nostalgia for a more open and community-driven platform, as suggested by comparing Bluesky to Netscape in its early days.
4. A desire to support a more decentralized and less commercial social media platform, as implied by &amp;quot;i like to think of bluesky(tm) as like a netscape . wherein they end up spitting out something cool and popular and open (and - commercial) but do not really find long term business success from that.&amp;quot;&amp;#39;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Great, that works. Now, what more can we do?&lt;/p&gt;
&lt;h2 id="complicating-things-hybrid-search"&gt;Complicating things: Hybrid Search&lt;/h2&gt;
&lt;p&gt;LanceDB also provides a nice hybrid search option, which lets us combine full-text search with semantic search. In the default configuration, it weights vector similarity and full-text search roughly 70-30.&lt;/p&gt;
&lt;p&gt;All we need to do is add &lt;code&gt;query_type='hybrid'&lt;/code&gt;:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hybrid_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Skeet&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;hybrid&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_pydantic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Skeet&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;question&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;What are the main reasons people are switching from twitter/X to bluesky?&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hybrid_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;build_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Question: &amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Answer: &lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;#39;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Question:  What are the main reasons people are switching from twitter/X to bluesky?
Answer:
&amp;#39;Based on the context, the main reasons people are switching from Twitter/X to Bluesky include:

* The ability to see what&amp;#39;s trending and have a better understanding of current events and public discourse
* Frustration with Twitter&amp;#39;s content and moderation policies (as implied by the comment &amp;quot;it doesn&amp;#39;t suck like x&amp;quot;)
* A desire for more nuanced and complex discussions, as evidenced by the quote about people being more complex than we want to assume
* The presence of internet trolls on Twitter, who are also migrating to Bluesky

These factors seem to be driving users to leave Twitter/X and join Bluesky.&amp;#39;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
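&lt;p&gt;To make the 70-30 idea concrete, here is a toy sketch of a weighted combination of two score lists. This is not LanceDB's internal implementation; the &lt;code&gt;normalize&lt;/code&gt; and &lt;code&gt;hybrid_rank&lt;/code&gt; helpers, the min-max scaling, and the scores are all assumptions made for illustration.&lt;/p&gt;

```python
# Toy illustration of weighted hybrid ranking (NOT LanceDB's actual code).
def normalize(scores):
    """Min-max scale a {doc_id: score} dict into the 0..1 range."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 1.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}


def hybrid_rank(vector_scores, fts_scores, vector_weight=0.7):
    """Rank doc ids by a weighted sum of vector and full-text scores."""
    v, f = normalize(vector_scores), normalize(fts_scores)
    combined = {
        doc_id: vector_weight * v.get(doc_id, 0.0)
        + (1 - vector_weight) * f.get(doc_id, 0.0)
        for doc_id in set(v) | set(f)
    }
    return sorted(combined, key=combined.get, reverse=True)


# Made-up scores: higher means more relevant in both dicts.
vector_scores = {"a": 0.9, "b": 0.5, "c": 0.2}
fts_scores = {"b": 12.0, "c": 3.0, "d": 8.0}

mostly_vector = hybrid_rank(vector_scores, fts_scores, vector_weight=0.7)
mostly_text = hybrid_rank(vector_scores, fts_scores, vector_weight=0.3)
print(mostly_vector, mostly_text)
```

&lt;p&gt;Flipping the weight changes which skeet comes out on top, which is exactly why the weighting is something to evaluate rather than take for granted.&lt;/p&gt;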

&lt;p&gt;Now, keep in mind that there are already at least two parameters we would need to tune and evaluate here: (1) the number of chunks we want to stuff in the context (here, &lt;code&gt;top_k&lt;/code&gt;), and (2) the type of search we would like to conduct (hybrid, full-text, vector only).&lt;/p&gt;
&lt;p&gt;But if that weren't complicated enough, we can complicate things a bit more!&lt;/p&gt;
&lt;h2 id="what-if-we-ask-for-a-summary"&gt;What if we ask for a summary?&lt;/h2&gt;
&lt;p&gt;Let's look at the following question:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;question&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Give me a high level summary of the skeets&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;If we take the naive approach of simply embedding this query, we will likely retrieve skeets that are similar to the question itself. At the extreme, this will surface skeets that contain the words "high level summary" and "skeets".&lt;/p&gt;
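&lt;p&gt;Here is a toy sketch of that failure mode. It uses a plain bag-of-words cosine similarity as a stand-in for the embedding model (real embeddings capture more than word overlap), and the three skeets are made up for illustration.&lt;/p&gt;

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector: a crude stand-in for a real embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


query = "give me a high level summary of the skeets"

# Made-up skeets: only the first one echoes the wording of the question.
skeets = [
    "here is a high level summary of my skeets this week",
    "leaving twitter because the moderation got worse",
    "the new bluesky starter packs are great",
]

ranked = sorted(skeets, key=lambda s: cosine(bow(query), bow(s)), reverse=True)
print(ranked[0])
```

&lt;p&gt;The top hit is the skeet that talks about summaries, not one that tells us anything about what people are actually posting.&lt;/p&gt;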
&lt;p&gt;Now, this might be what we want, but likely isn't. What we want in this case is to provide more context to the LLM regarding topics that align with the overarching content of the skeets, and then let the LLM itself infer the high-level summary from the context.&lt;/p&gt;
&lt;p&gt;The way I think about this is the notion of filters. Here's a quick drawing that illustrates the concept:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/74/summarization.png" alt="RAG question classification" style="max-width:100%;border-radius: 4px"&gt;&lt;/p&gt;
&lt;p&gt;First of all, how do we detect if the user asks such a question?&lt;/p&gt;
&lt;h3 id="detecting-representative-questions"&gt;Detecting representative questions&lt;/h3&gt;
&lt;p&gt;The most basic approach to detecting "representative" questions is to have an LLM classify them:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_intent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;INTENT_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="s2"&gt;    Tell me which filter we should use for the given user question.&lt;/span&gt;
&lt;span class="s2"&gt;    The keyword filter filters data based on keywords from the question. This is good for specific questions or when you want to focus on a particular topic.&lt;/span&gt;
&lt;span class="s2"&gt;    The representative filter returns a representative sample of insights, from which you can infer the answer to the question. This is good for questions that relate to the entire dataset (trends, summaries, etc)&lt;/span&gt;
&lt;span class="s2"&gt;    Respond only with the type of filter to use!&lt;/span&gt;

&lt;span class="s2"&gt;    Examples:&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;#39;Summarize the data&amp;#39; -&amp;gt; Representative filter&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;#39;What are the main insights?&amp;#39; -&amp;gt; Representative filter&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;#39;What are the high-level discussion points from our field reps regarding NOS ?&amp;#39; -&amp;gt; Keyword filter&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;#39;Describe the negative, neutral and positive perception of physicians for Benuron?&amp;#39; -&amp;gt; Keyword filter&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;#39;What are the main trends from the following skeets?&amp;#39; -&amp;gt; Representative filter&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;#39;What are the main trends?&amp;#39; -&amp;gt; Representative filter&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;#39;What are the key tweets I should be aware of?&amp;#39; -&amp;gt; Representative filter&lt;/span&gt;

&lt;span class="s2"&gt;    Question: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;INTENT_PROMPT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Summarize the skeets&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;What are the main trends in this dataset?&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;What&amp;#39;s the current weather in Lisbon?&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;Why are people leaving Twitter/X?&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;#39;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;get_intent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&amp;#39;Summarize the skeets&amp;#39; Representative filter
&amp;#39;What are the main trends in this dataset?&amp;#39; Representative filter
&amp;#39;What&amp;#39;s the current weather in Lisbon?&amp;#39; Keyword filter
&amp;#39;Why are people leaving Twitter/X?&amp;#39; Keyword filter
&amp;#39;Give me a high level summary of the skeets&amp;#39; Representative filter
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The LLM does quite a good job of telling whether we need a general overview or a specific context.&lt;/p&gt;
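&lt;p&gt;Since the model answers in free text, it can help to normalise that answer before routing on it. Here's a minimal sketch, assuming the reply contains either the word "keyword" or the word "representative" as in the output above. The &lt;code&gt;Filter&lt;/code&gt; enum and &lt;code&gt;parse_filter&lt;/code&gt; are hypothetical names for illustration, not part of the original notebook:&lt;/p&gt;

```python
from enum import Enum


class Filter(Enum):
    """The two retrieval strategies described in the intent prompt (hypothetical names)."""

    KEYWORD = "keyword"
    REPRESENTATIVE = "representative"


def parse_filter(raw_label: str, default: Filter = Filter.KEYWORD) -> Filter:
    """Map the free-text answer of the LLM to a Filter member."""
    label = raw_label.strip().lower()
    for candidate in Filter:
        # Substring match tolerates answers such as "Representative filter."
        if candidate.value in label:
            return candidate
    # Unexpected answer: fall back to the default strategy.
    return default


print(parse_filter("Representative filter"))
```

&lt;p&gt;Falling back to the keyword filter on an unexpected answer is a design choice, not a requirement: it simply keeps the pipeline on the plain retrieval path when the classification is unclear.&lt;/p&gt;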
&lt;h3 id="answering-questions-that-require-the-entire-dataset-in-context"&gt;Answering questions that require the entire dataset in context&lt;/h3&gt;
&lt;p&gt;There's an interesting article that dives a bit more into this topic &lt;a href="https://pashpashpash.substack.com/p/tackling-the-challenge-of-document"&gt;here&lt;/a&gt;. One idea I stole from there is passing a 'representative' sample of our dataset to the context.&lt;/p&gt;
&lt;p&gt;Now how can we build a representative sample? Here's one idea:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The user asks a general question that requires the whole dataset in context&lt;/li&gt;
&lt;li&gt;We know we can stuff a maximum of &lt;code&gt;Z&lt;/code&gt; chunks into the context&lt;/li&gt;
&lt;li&gt;We retrieve a large number of embeddings from our dataset at random&lt;/li&gt;
&lt;li&gt;We run K-means clustering or &lt;a href="https://hdbscan.readthedocs.io/en/latest/how_hdbscan_works.html"&gt;something more fancy&lt;/a&gt; to cluster those embeddings into &lt;code&gt;N&lt;/code&gt; clusters/topics&lt;/li&gt;
&lt;li&gt;From each of those topics we draw &lt;code&gt;n&lt;/code&gt; items so that &lt;code&gt;n * N ~= Z&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;We stuff those into the context&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;question&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&amp;#39;Give me a high level summary of the skeets&amp;#39;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_representative_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;max_items_context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# max number of items to stuff in the context&lt;/span&gt;
    &lt;span class="n"&gt;n_clusters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# number of clusters to create&lt;/span&gt;
    &lt;span class="n"&gt;n_items_initial_retrieval&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100_000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# number of items to retrieve from the database&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Skeet&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="sd"&gt;    Retrieves a representative sample of items from the database to be used as context.&lt;/span&gt;

&lt;span class="sd"&gt;    Args:&lt;/span&gt;
&lt;span class="sd"&gt;        max_items_context (int): The maximum number of items to include in the context.&lt;/span&gt;
&lt;span class="sd"&gt;        n_clusters (int): The number of clusters to create.&lt;/span&gt;
&lt;span class="sd"&gt;        n_items_initial_retrieval (int): The number of items to retrieve from the database.&lt;/span&gt;

&lt;span class="sd"&gt;    Returns:&lt;/span&gt;
&lt;span class="sd"&gt;        list[Skeet]: A list of items from the database.&lt;/span&gt;
&lt;span class="sd"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

    &lt;span class="n"&gt;items_per_cluster&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_items_context&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;n_clusters&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;items_per_cluster&lt;/span&gt;&lt;span class="si"&gt;=}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;results_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_pandas&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;n_items_initial_retrieval&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results_list&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;results_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;results_list&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_items_initial_retrieval&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;records&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;results_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;results_list&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;records&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;results_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Skeet&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results_list&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;kmeans&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;KMeans&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_clusters&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n_clusters&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results_list&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;kmeans&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;cluster_labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kmeans&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;labels_&lt;/span&gt;

    &lt;span class="n"&gt;context_items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;cluster_id&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cluster_labels&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="n"&gt;cluster_items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="n"&gt;item&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results_list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cluster_labels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;cluster_id&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cluster_items&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cluster_labels&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cluster_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
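        &lt;span class="c1"&gt;# sample without replacement so the same skeet is never added twice&lt;/span&gt;
        &lt;span class="n"&gt;context_items&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cluster_items&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;items_per_cluster&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cluster_items&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;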

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;context_items&lt;/span&gt;

&lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_representative_context&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;build_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Based on the provided context, it appears that there are various &amp;quot;skeets&amp;quot; ( likely referring to Twitter threads or short conversations) scattered throughout the text. However, without further information, I can only provide a general high-level summary:

The skeets seem to be a collection of miscellaneous conversations and comments from various individuals on topics such as:

* Music and celebrities (e.g., Drake, Kendrick)
* Gaming and entertainment (e.g., League of Legends, Arcane)
* Politics and social issues (e.g., Trump, immigration)
* Pop culture references (e.g., references to movies, TV shows, or memes)
* Personal experiences and anecdotes (e.g., someone&amp;#39;s spa date for their John Deere tractor)

These conversations seem to be a mix of humorous, sarcastic, and serious discussions. If you&amp;#39;d like me to provide more specific information about any particular skeet, please let me know!
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;There are a lot of ways we could make this better. But this is the basic idea.&lt;/p&gt;
&lt;h2 id="expanding-user-queries"&gt;Expanding user queries&lt;/h2&gt;
&lt;p&gt;Another interesting concept is query expansion. I read about this one in the &lt;a href="https://www.amazon.com/LLM-Engineers-Handbook-engineering-production/dp/1836200072"&gt;&lt;em&gt;LLM Engineer's Handbook&lt;/em&gt;&lt;/a&gt;. It's a great book - you should read it too!&lt;/p&gt;
&lt;p&gt;In this case we use an LLM to expand the user query in the hope of retrieving even more relevant items into our context. Here's how it works:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;expand_query_for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expand_to_n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;

    &lt;span class="c1"&gt;# could be much better by using json mode, structured outputs, or any of these: https://simmering.dev/blog/structured_output/&lt;/span&gt;
    &lt;span class="n"&gt;EXPAND_QUERY_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="s2"&gt;* You are an AI language model assistant.&lt;/span&gt;
&lt;span class="s2"&gt;* Your task is to generate &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;expand_to_n&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; different versions of the given user question to retrieve relevant documents from a vector database.&lt;/span&gt;
&lt;span class="s2"&gt;* By generating multiple perspectives on the user question, your goal is to help the user overcome some of the limitations of the distance-based similarity search.&lt;/span&gt;
&lt;span class="s2"&gt;* Return a json string with a &amp;#39;questions&amp;#39; key, which is a list of strings. It should be parseable by json.loads in Python.&lt;/span&gt;
&lt;span class="s2"&gt;* IMPORTANT: Do not include ```json or any other text in the response.&lt;/span&gt;

&lt;span class="s2"&gt;Original question:&lt;/span&gt;

&lt;span class="s2"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;#39;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EXPAND_QUERY_PROMPT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;questions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;questions&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;questions&lt;/span&gt;

&lt;span class="n"&gt;question&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;What are the main reasons people are switching from twitter/X to bluesky?&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;expanded_questions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;expand_query_for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;expanded_questions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;- &amp;#39;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;#39;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;- &amp;#39;Why did users leave Twitter and join Bluesky?&amp;#39;
- &amp;#39;What factors contribute to the migration of users from X (formerly Twitter) to Bluesky?&amp;#39;
- &amp;#39;Identify the key drivers behind the shift from Twitter to Bluesky, based on user behavior and preferences&amp;#39;
- &amp;#39;What are the primary reasons users are abandoning their old social media accounts on Twitter/X and joining a new platform like Bluesky?&amp;#39;
- &amp;#39;How does the user experience and feature set of Bluesky differ from that of Twitter/X, leading to increased adoption?&amp;#39;
- &amp;#39;What are the main reasons people are switching from twitter/X to bluesky?&amp;#39;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
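&lt;p&gt;Calling &lt;code&gt;json.loads&lt;/code&gt; directly on the response only works as long as the model follows the prompt to the letter. Until you move to JSON mode or structured outputs, a slightly more defensive parser can help. This is a rough sketch under my own assumptions: &lt;code&gt;parse_questions&lt;/code&gt; is a hypothetical helper, and falling back to the original question is just one reasonable choice:&lt;/p&gt;

```python
import json


def parse_questions(response: str, question: str) -> list[str]:
    """Parse the expansion response, always keeping the original question."""
    fence = "`" * 3  # markdown code fence some models add despite the prompt
    text = response.strip()
    if text.startswith(fence):
        # Drop the surrounding backticks and an optional "json" language tag.
        text = text.strip("`").removeprefix("json").strip()
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        # Unparseable answer: fall back to retrieving with the original question only.
        return [question]
    candidates = payload.get("questions", []) if isinstance(payload, dict) else []
    questions = [q.strip() for q in candidates if isinstance(q, str) and q.strip()]
    if question not in questions:
        questions.append(question)
    return questions


print(parse_questions('{"questions": ["Why did users leave Twitter?"]}', "Why Bluesky?"))
```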

&lt;p&gt;We can now take these questions, do a retrieval for each one of them, and answer the question.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;expanded_questions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Question: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Answer:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;build_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Question: What are the main reasons people are switching from twitter/X to bluesky?
Answer:
Based on the context provided, the main reasons people are switching from Twitter/X to Bluesky include:

1. **Frustration with Twitter/X**: Many users express their dissatisfaction with the current state of Twitter/X, mentioning issues like transphobia, rampant engagement farming, and the platform&amp;#39;s overall toxic environment.
2. **Desire for a more positive experience**: Users are seeking an alternative where they can engage with others in a more constructive and respectful manner. This is evident from statements like &amp;quot;it doesn&amp;#39;t suck like x (twitter)&amp;quot; and &amp;quot;y&amp;#39;all are lovely.&amp;quot;
3. **Wish to preserve the original spirit of Twitter**: Some users nostalgically recall the early days of Twitter, when it was more open, popular, and user-friendly. They hope Bluesky will emulate this spirit.
4. **Attractiveness of new features or improvements**: Although not explicitly stated, some users might be drawn to specific features or improvements that Bluesky offers compared to Twitter/X.

Overall, the main reasons people are switching from Twitter/X to Bluesky seem to be a combination of dissatisfaction with the current state of the former and a desire for a better alternative.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
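&lt;p&gt;One thing to watch out for: the expanded questions are close to each other, so the same skeet can be retrieved more than once and end up taking space in the context. A small sketch of de-duplicating the merged results while keeping their order. The &lt;code&gt;dedupe&lt;/code&gt; helper and the choice of key are assumptions for illustration, so use whatever uniquely identifies an item in your data:&lt;/p&gt;

```python
from typing import Callable, Hashable, Iterable, TypeVar

T = TypeVar("T")


def dedupe(items: Iterable[T], key: Callable[[T], Hashable]) -> list[T]:
    """Return items in their original order, keeping the first occurrence of each key."""
    seen: set[Hashable] = set()
    unique: list[T] = []
    for item in items:
        identifier = key(item)
        if identifier in seen:
            continue
        seen.add(identifier)
        unique.append(item)
    return unique


# Toy example with dictionaries standing in for retrieved items.
retrieved = [{"text": "bluesky is lovely"}, {"text": "x is toxic"}, {"text": "bluesky is lovely"}]
print(dedupe(retrieved, key=lambda item: item["text"]))
```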

&lt;h2 id="the-most-important-lesson"&gt;The most important lesson&lt;/h2&gt;
&lt;p&gt;These are all fancy and nice. But make sure to always start with a baseline. &lt;a href="https://pypi.org/project/rank-bm25/"&gt;A simple one.&lt;/a&gt;&lt;/p&gt;
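&lt;p&gt;To give an idea of how little a baseline needs, here is a dependency-free sketch of BM25 scoring. It is not the API of the library linked above. It's my own minimal version, assuming whitespace tokenization, the common defaults &lt;code&gt;k1=1.5&lt;/code&gt; and &lt;code&gt;b=0.75&lt;/code&gt;, and the non-negative IDF variant &lt;code&gt;log(1 + (N - df + 0.5) / (df + 0.5))&lt;/code&gt;:&lt;/p&gt;

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Naive whitespace tokenizer (an assumption, swap in something better)."""
    return text.lower().split()


def bm25_scores(query: str, documents: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query with Okapi BM25."""
    corpus = [tokenize(doc) for doc in documents]
    if not corpus:
        return []
    n_docs = len(corpus)
    avg_len = sum(len(doc) for doc in corpus) / n_docs
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for doc in corpus for term in set(doc))

    scores = []
    for doc in corpus:
        term_freq = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in term_freq:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            tf = term_freq[term]
            # Longer-than-average documents are penalised through b.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = ["leaving twitter for bluesky", "my tractor had a spa date", "bluesky is lovely"]
ranked = sorted(zip(bm25_scores("bluesky twitter", docs), docs), reverse=True)
print(ranked[0][1])
```

&lt;p&gt;If the fancy pipeline can't beat something like this on your own questions, that's worth knowing before adding more moving parts.&lt;/p&gt;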
&lt;h2 id="resources-to-go-beyond-the-basics"&gt;Resources to go beyond the basics&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://jxnl.co/writing/category/rag/"&gt;Most things by Jason Liu are great&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arcturus-labs.com/blog/category/retrieval/"&gt;Arcturus Labs has some good writing on the topic&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/tags/rag/"&gt;Simon Willison's entries are always worth a skim/read&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/LLM-Engineers-Handbook-engineering-production/dp/1836200072"&gt;LLM Engineer's Handbook from Paul Iusztin and Maxime Labonne&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.manning.com/books/relevant-search"&gt;I'm currently reading Relevant Search&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.pamelafox.org/2024/09/integrating-vision-into-rag-applications.html"&gt;Handling vision in RAG&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://youtu.be/0nA5QG3087g?si=y_JmyCM_4vc59l62"&gt;This talk by Ben Clavié&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://parlance-labs.com/education/rag/"&gt;Most of these talks from Parlance Labs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.ragas.io/en/stable/"&gt;Evaluate RAG with RAGAS&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.anyscale.com/blog/a-comprehensive-guide-for-building-rag-based-llm-applications-part-1"&gt;This blog post by Goku Mohandas&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.infoq.com/news/2023/10/practical-advice-RAG/"&gt;This talk by Sam Partee&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content><category term="blog"/></entry><entry><title>An ode to the Marathon</title><link href="https://duarteocarmo.com/blog/the-marathon.html" rel="alternate"/><published>2024-12-17T00:00:00+01:00</published><updated>2024-12-17T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2024-12-17:/blog/the-marathon.html</id><summary type="html">&lt;p&gt;I'm flying over the Atlantic just between France and Spain for my yearly Christmas trip back to Lisbon. For some reason, I just remembered the blog post I always wanted to write. This one is not about tech, but it's about something I'm equally passionate about: The Marathon.&lt;/p&gt;
&lt;p&gt;Funnily - I'm …&lt;/p&gt;</summary><content type="html">&lt;p&gt;I'm flying over the Atlantic just between France and Spain for my yearly Christmas trip back to Lisbon. For some reason, I just remembered the blog post I always wanted to write. This one is not about tech, but it's about something I'm equally passionate about: The Marathon.&lt;/p&gt;
&lt;p&gt;Funnily - I'm flying pretty close to where my last battle went down: right off the beautiful Côte d'Azur, about a month ago, for the &lt;a href="https://cotedazurfrance.com/discover/the-top-events-and-festivals-on-the-cote-dazur/the-alpes-maritimes-nice-cannes-marathon/"&gt;Nice Marathon&lt;/a&gt;, my seventh in ~3 years.&lt;/p&gt;
&lt;p&gt;And if that makes you think that Duarte is probably addicted, you're probably right. But I would argue I'm not addicted to the Marathon itself. I'm addicted to the process.&lt;/p&gt;
&lt;p&gt;Let me explain.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;"&lt;em&gt;What's your shoe rotation?&lt;/em&gt;"&lt;/li&gt;
&lt;li&gt;"&lt;em&gt;What gear should I buy?&lt;/em&gt;"&lt;/li&gt;
&lt;li&gt;"&lt;em&gt;Should I join a run club?&lt;/em&gt;"&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All of these miss the point of running completely. Running is a solo sport. And a marathon prep is a tough mental challenge to do well. "&lt;em&gt;I have to work&lt;/em&gt;", "&lt;em&gt;I won't make it this time&lt;/em&gt;", "&lt;em&gt;I'm feeling tired today&lt;/em&gt;". These are just a small taste of the challenge you are really up against: yourself.&lt;/p&gt;
&lt;p&gt;Why would you even? Why would you get up when it's dark and wet outside? Why would you go out for a run when it's the 17th of December on the windy coastline of Denmark? Because you know what lies ahead.&lt;/p&gt;
&lt;p&gt;After those dark and lonely training months the day &lt;em&gt;finally&lt;/em&gt; gets here. Most of us have a superstitious pre-race routine. Something that helps us regain the belief that things will go according to plan (spoiler alert: they never do). Mine is pretty simple: (1) lay the kit on the floor near the bed, and (2) make sure you have half a dose of Banana chocolate chip &lt;a href="https://www.ekosport.pt/overstims-gatosport-banane-pepites-chocolat-p-9-74741"&gt;Gatosport&lt;/a&gt; ready for the next morning.&lt;/p&gt;
&lt;p&gt;Nerves are cracking. Thousands of people lined up with you. If you look closely you can tell: the closer to the front people are, the more nervous they are. This is the moment you have been waiting for. There's music, there are announcements, there's noise. A lot of people dance and get excited; I can't bring myself to. The nerves are just too much.&lt;/p&gt;
&lt;p&gt;The start of a Marathon gives me a feeling I only got in high school, when the teacher was walking around the classroom, grading everyone one by one. I was always nervous waiting to hear if I had gotten more than a 10 out of 20.&lt;/p&gt;
&lt;p&gt;All this bullshit, all this music, announcements, all of this dancing around the fire. Now we're going. Some will say the &lt;a href="/blog/run-every-day.html"&gt;inspiration comes from within&lt;/a&gt;. For the first two-thirds of a Marathon, I often find myself looking sideways. Not because I'm &lt;em&gt;comparing&lt;/em&gt; myself to them. But because I can CLEARLY see it! I can see the &lt;em&gt;exact&lt;/em&gt; same pain I'm experiencing. It's a rare moment to be surrounded by a group of people all going through the same painful experience at the exact same time. You don't know why they are suffering, why they decided to do this, where they are from, or what they're called - but you know what they're thinking. The same as you.&lt;/p&gt;
&lt;p&gt;Now you've done the hardest part. You've run for 30 Km. You've trained for this: You're ready. Just some of the things I wish I told myself right around that section. The actual thoughts are a mix of "&lt;em&gt;I don't want to do this&lt;/em&gt;", "&lt;em&gt;I should've trained harder&lt;/em&gt;", "&lt;em&gt;I can walk a bit - what's wrong with that&lt;/em&gt;", "&lt;em&gt;I will never run a Marathon again&lt;/em&gt;". Welcome to the wall.&lt;/p&gt;
&lt;p&gt;Even having run a 'respectable' number of Marathons, the feeling never goes away. Hopefully, you'll plow through. Even if you do start slowing down, even if you lose the pacer or group, you'll keep going.&lt;/p&gt;
&lt;p&gt;"&lt;em&gt;You've already ran 40K! You're not going to stop now!&lt;/em&gt;" I often remember that American lady shouting at me from the sideline in Vienna some years ago. She was most definitely right.&lt;/p&gt;
&lt;p&gt;Yes, those 2 Km now feel like 200. Yes, your legs are tired. But you can see it! You can see that finish line. You made it! Weirdly, you can sprint now; all the pain goes away. It's the culmination of months of effort and hours of pain. This is where you'll probably shed a tear and celebrate with your loved ones.&lt;/p&gt;
&lt;p&gt;Personal-best or not, perfect-race or not. You've done more than 99% of people that day. And that's something to be proud of.&lt;/p&gt;
&lt;p&gt;Now comes the hangover. For some it's a week, for some it's 6 months, for others, it lasts years. No running. No exercise. You've done your part. You've done well. But then… Slowly… You start thinking about it. "&lt;em&gt;It can't be always that painful&lt;/em&gt;", "&lt;em&gt;There are at least 2 or 3 things I could've done better!&lt;/em&gt;", "&lt;em&gt;I'm running already, I should probably do a race...&lt;/em&gt;"&lt;/p&gt;
&lt;p&gt;And that's when you book your next one.&lt;/p&gt;
&lt;p&gt;&lt;br&gt;&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/73/marathon_single.jpg" alt="Vienna Marathon" style="max-width:50%;border-radius: 4px;filter: grayscale(100%)"&gt;
&lt;/center&gt;&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Upgrading this website's podcast with F5-TTS</title><link href="https://duarteocarmo.com/blog/podcast-tts-f5-tts-python.html" rel="alternate"/><published>2024-11-12T00:00:00+01:00</published><updated>2024-11-12T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2024-11-12:/blog/podcast-tts-f5-tts-python.html</id><summary type="html">&lt;p&gt;For the &lt;a href="https://duarteocarmo.com/blog/you-can-now-listen-to-this-blog.html"&gt;past year&lt;/a&gt;, this website's &lt;a href="https://podcasts.apple.com/us/podcast/duarte-o-carmos-articles/id1719493997"&gt;podcast companion&lt;/a&gt; has been running on a text-to-speech model called &lt;a href="https://huggingface.co/coqui/XTTS-v2"&gt;XTTS-v2&lt;/a&gt;. It's not &lt;em&gt;horrible&lt;/em&gt;. And those who have heard my voice before might notice some similarities. But it's far from a &lt;em&gt;pleasant&lt;/em&gt; listening experience.&lt;/p&gt;
&lt;p&gt;But the world of text-to-speech (TTS) has been gradually moving …&lt;/p&gt;</summary><content type="html">&lt;p&gt;For the &lt;a href="https://duarteocarmo.com/blog/you-can-now-listen-to-this-blog.html"&gt;past year&lt;/a&gt;, this website's &lt;a href="https://podcasts.apple.com/us/podcast/duarte-o-carmos-articles/id1719493997"&gt;podcast companion&lt;/a&gt; has been running on a text-to-speech model called &lt;a href="https://huggingface.co/coqui/XTTS-v2"&gt;XTTS-v2&lt;/a&gt;. It's not &lt;em&gt;horrible&lt;/em&gt;. And those who have heard my voice before might notice some similarities. But it's far from a &lt;em&gt;pleasant&lt;/em&gt; listening experience.&lt;/p&gt;
&lt;p&gt;But the world of text-to-speech (TTS) has been gradually moving along. &lt;a href="https://notebooklm.google/"&gt;NotebookLM&lt;/a&gt; made headlines, and everyone is sure the future of AI will sit somewhere between agents (whatever those are) and text-to-speech. NotebookLM is an awesome product. But it's still, at its core, a closed source product&lt;sup id="sf-podcast-tts-f5-tts-python-1-back"&gt;&lt;a href="#sf-podcast-tts-f5-tts-python-1" class="simple-footnote" title="There are some open source implementations, but the core tech is still very much closed source."&gt;1&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
&lt;p&gt;Unfortunately, for state of the art text-to-speech there is still a single undisputed champion out there: &lt;a href="https://elevenlabs.io/"&gt;ElevenLabs&lt;/a&gt;. Or should I say, there &lt;em&gt;was&lt;/em&gt;?&lt;/p&gt;
&lt;p&gt;Late this year a new model from Microsoft came out, called &lt;a href="https://arxiv.org/abs/2406.18009"&gt;E2-TTS&lt;/a&gt;. Sometime after, &lt;a href="https://github.com/SWivid/F5-TTS"&gt;F5-TTS&lt;/a&gt; was released - an improvement over E2 focused on inference and optimization. I was amazed by the &lt;a href="https://www.microsoft.com/en-us/research/project/e2-tts/"&gt;demos&lt;/a&gt; - they sounded &lt;em&gt;crisp&lt;/em&gt; and &lt;em&gt;natural&lt;/em&gt;. After browsing the code for a little bit, it seemed like an easy thing to test out.&lt;/p&gt;
&lt;p&gt;I tested the engine with &lt;a href="https://duarteocarmo.com/blog/around-iceland-6-days-camping-itinerary.html"&gt;one of my latest posts&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://huggingface.co/coqui/XTTS-v2"&gt;XTTS-v2&lt;/a&gt;&lt;/strong&gt;
&lt;audio controls style="width: 100%; display: block" preload="metadata"&gt;&lt;source src="https://r2.duarteocarmo.com/old/old.mp3" type="audio/mpeg"&gt;&lt;/audio&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/SWivid/F5-TTS"&gt;F5-TTS&lt;/a&gt;&lt;/strong&gt;
&lt;audio controls style="width: 100%; display: block" preload="metadata"&gt;&lt;source src="https://r2.duarteocarmo.com/transcripts/10bb68f8f7200f19b71bb095af3c5f32.mp3" type="audio/mpeg"&gt;&lt;/audio&gt;&lt;/p&gt;
&lt;p&gt;I mean, would you just listen to that. No more weird accelerations and squeaky voices. The new model is just much more stable and reliable than the previous one. It's like night and day! It's still not perfect though. One might argue the voice gets a bit &lt;em&gt;too&lt;/em&gt; intense sometimes. The problem with pronunciation persists. It still doesn't know how to say my name properly - but then again - most people can't either.&lt;/p&gt;
&lt;p&gt;I tried getting around the pronunciation by specifying the phoneme sequence&lt;sup id="sf-podcast-tts-f5-tts-python-2-back"&gt;&lt;a href="#sf-podcast-tts-f5-tts-python-2" class="simple-footnote" title="See the demo page for details"&gt;2&lt;/a&gt;&lt;/sup&gt;, but that didn't seem to change much. It did give me a good idea: to &lt;a href="https://github.com/duarteocarmo/podcaster/blob/master/src/podcaster/parser.py#L109"&gt;preprocess&lt;/a&gt; the text with an LLM - and that also makes things smoother.&lt;/p&gt;
&lt;p&gt;Running F5-TTS was also trivial thanks to Modal's great product (still haven't paid a penny I must say, but would happily). Only this specific function needs to run on a GPU, and it does so just by using &lt;a href="https://modal.com/docs/guide/gpu"&gt;Modal's decorator&lt;/a&gt;:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nd"&gt;@app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;gpu&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;MODAL_GPU&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;mounts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;modal&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Mount&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_local_dir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;LOCAL_DATA_DIR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;remote_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;MODAL_REMOTE_DATA_DIR&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;transcribe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ParsedArticle&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;reference_voice_file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;REFERENCE_VOICE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;reference_text_file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;REFERENCE_TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MODEL_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reference_text_file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"r"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;voice_text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;reference_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;voice_text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;command&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s2"&gt;"f5-tts_infer-cli"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;"--model"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;"--ref_audio"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;reference_voice_file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;"--ref_text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;reference_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;"--gen_text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text_for_tts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;

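    &lt;span class="c1"&gt;# Capture output so the assertion below can report stderr/stdout (check=True would raise before it runs)&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;capture_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;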
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;returncode&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"Error: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, &lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stdout&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

    &lt;span class="n"&gt;target_file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"tests/infer_cli_out.wav"&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;"File &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;target_file&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; does not exist."&lt;/span&gt;

    &lt;span class="n"&gt;mp3_file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;convert_to_mp3&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mp3_file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"rb"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;audio_file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;audio_bytes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;audio_file&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;audio_bytes&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
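&lt;p&gt;The snippet above calls a &lt;code&gt;convert_to_mp3&lt;/code&gt; helper that isn't shown here. A minimal sketch of what such a helper could look like, assuming &lt;code&gt;ffmpeg&lt;/code&gt; is installed in the container image. The function bodies, the &lt;code&gt;build_ffmpeg_command&lt;/code&gt; name and the 128k bitrate are my assumptions for illustration, not necessarily what Podcaster actually does:&lt;/p&gt;

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(wav_path: str, bitrate: str = "128k") -> list[str]:
    """Build the ffmpeg call that writes an MP3 next to the given WAV file."""
    mp3_path = Path(wav_path).with_suffix(".mp3")
    return [
        "ffmpeg",
        "-y",  # overwrite the output file if it already exists
        "-i", str(wav_path),  # input file
        "-codec:a", "libmp3lame",  # encode the audio stream as MP3
        "-b:a", bitrate,  # assumed bitrate, adjust to taste
        str(mp3_path),
    ]


def convert_to_mp3(wav_path: str) -> str:
    """Hypothetical stand-in for the helper used above; needs ffmpeg on PATH."""
    command = build_ffmpeg_command(wav_path)
    # check=True raises CalledProcessError if ffmpeg exits with a non-zero code
    subprocess.run(command, check=True, capture_output=True)
    return command[-1]  # path of the MP3 that was written
```

&lt;p&gt;Keeping the command builder separate from the call to &lt;code&gt;subprocess.run&lt;/code&gt; makes it possible to check the arguments without actually having &lt;code&gt;ffmpeg&lt;/code&gt; around.&lt;/p&gt;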

&lt;p&gt;If you're interested in the rest of the code, &lt;a href="https://github.com/duarteocarmo/podcaster"&gt;Podcaster&lt;/a&gt; is free and open source.&lt;/p&gt;
&lt;hr&gt;&lt;ol class="simple-footnotes"&gt;&lt;li id="sf-podcast-tts-f5-tts-python-1"&gt;There are &lt;a href="https://github.com/lucidrains/soundstorm-pytorch"&gt;some&lt;/a&gt; &lt;a href="https://github.com/meta-llama/llama-recipes/tree/main/recipes/quickstart/NotebookLlama"&gt;open&lt;/a&gt; &lt;a href="https://huggingface.co/spaces/gabrielchua/open-notebooklm"&gt;source&lt;/a&gt; implementations, but the core tech is still very much closed source. &lt;a href="#sf-podcast-tts-f5-tts-python-1-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;li id="sf-podcast-tts-f5-tts-python-2"&gt;See the &lt;a href="https://www.microsoft.com/en-us/research/project/e2-tts/"&gt;demo page&lt;/a&gt; for details &lt;a href="#sf-podcast-tts-f5-tts-python-2-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;</content><category term="blog"/></entry><entry><title>What if ChatGPT's memory was a knowledge graph?</title><link href="https://knowledgegraphchat.duarteocarmo.com/" rel="alternate"/><published>2024-10-29T00:00:00+01:00</published><updated>2024-10-29T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:knowledgegraphchat.duarteocarmo.com,2024-10-29:/</id><content type="html"/><category term="blog"/></entry><entry><title>Classification in the age of LLMs: The emoji problem</title><link href="https://duarteocarmo.com/blog/classification-llms-emoji-open-source-ollama.html" rel="alternate"/><published>2024-10-07T00:00:00+02:00</published><updated>2024-10-07T00:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2024-10-07:/blog/classification-llms-emoji-open-source-ollama.html</id><summary type="html">&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/70/header_169.png" alt="Fruits and bowl" style="max-width:100%;border-radius: 2px;"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;For the past few years Vitto and I have used &lt;a href="https://tricount.com/"&gt;Tricount&lt;/a&gt; to track our shared expenses. The app is actually pretty good, but there’s one small thing that annoys me quite a bit.&lt;/p&gt;
&lt;p&gt;Even though we spoke English to each other for the first month, we’ve since spoken a …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/70/header_169.png" alt="Fruits and bowl" style="max-width:100%;border-radius: 2px;"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;For the past few years Vitto and I have used &lt;a href="https://tricount.com/"&gt;Tricount&lt;/a&gt; to track our shared expenses. The app is actually pretty good, but there’s one small thing that annoys me quite a bit.&lt;/p&gt;
&lt;p&gt;Even though we spoke English to each other for the first month, we’ve since spoken a mix of Italian and Portuguese. And when I say &lt;em&gt;a mix&lt;/em&gt;, I really mean &lt;em&gt;a mix&lt;/em&gt;. I mean, would you just look at our expense tracking app:&lt;/p&gt;
&lt;div style="display: flex; justify-content: center; gap: 10px; width: 100%; max-width: 100%;"&gt;
  &lt;img src="https://duarteocarmo.com/images/70/tricount_1.png" alt="Image 1" style="max-width: calc(50% - 5px); height: auto; flex: 1;"&gt;
  &lt;img src="https://duarteocarmo.com/images/70/tricount_2.png" alt="Image 2" style="max-width: calc(50% - 5px); height: auto; flex: 1;"&gt;
&lt;/div&gt;

&lt;p&gt;For the average calm and relaxed reader, this might not seem like much. But for us, other folk, it’s extremely annoying to see how it misses assigning the right emojis to most of our transactions! It gets Beers (🍺), it gets it when we write something in English like “Train” (🚂). But whenever we write something in Portuguese or Italian, BAM, here comes the generic 💶 emoji.&lt;/p&gt;
&lt;p&gt;Now. Is this a big problem? No. Is it an annoying problem? Yes. Is this a solved problem? &lt;em&gt;Probably&lt;/em&gt;. But in the age of LLMs and 400-billion-parameter models, what options do we have?&lt;/p&gt;
&lt;div class="toc"&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#creating-a-labeled-dataset-from-screenshots"&gt;Creating a labeled dataset from screenshots&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#classification-using-gpt-4o-and-structured-outputs"&gt;Classification using gpt-4o and structured outputs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#function-calling-local-models-with-ollama"&gt;Function calling local models with Ollama&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#embeddings-for-classification"&gt;Embeddings for classification&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#benchmarking-expense-categorization"&gt;Benchmarking expense categorization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#closing-thoughts"&gt;Closing thoughts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#epilogue-just-get-the-right-token"&gt;Epilogue: Just get the right token&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id="creating-a-labeled-dataset-from-screenshots"&gt;Creating a labeled dataset from screenshots&lt;/h2&gt;
&lt;p&gt;Before looking at all the ways the Tricount emoji classification could be better, we should first gather a reasonable dataset. What used to require some &lt;em&gt;grueling&lt;/em&gt; hand labelling or some clunky OCR library, now ‘only’ requires a couple of calls to &lt;code&gt;gpt-4o&lt;/code&gt;.&lt;/p&gt;
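&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;base64&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pathlib&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# reads OPENAI_API_KEY from the environment&lt;/span&gt;

&lt;span class="c1"&gt;# Define data structure&lt;/span&gt;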

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ExpenseItem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;
    &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;payer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;emoji&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Expenses&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;expenses&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ExpenseItem&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Encode the images in base64&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;encode_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;rb&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;image_file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;base64&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;b64encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_file&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;utf-8&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;images_content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;type&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;image_url&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;image_url&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;url&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;data:image/png;base64,&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;encode_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ima&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;ima&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;../tricount_pictures&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;glob&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;*.PNG&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Classify&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;role&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;user&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;content&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="s2"&gt;&amp;quot;type&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;text&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="s2"&gt;&amp;quot;text&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;List all the expenses in a json format&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;images_content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;response_format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Expenses&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;expenses&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parsed&lt;/span&gt;

&lt;span class="c1"&gt;# prints&lt;/span&gt;

&lt;span class="c1"&gt;#[ExpenseItem(name=&amp;#39;Beers&amp;#39;, amount=XXX, currency=&amp;#39;DKK&amp;#39;, payer=&amp;#39;Dudu (me)&amp;#39;, emoji=&amp;#39;🍺&amp;#39;),&lt;/span&gt;

&lt;span class="c1"&gt;# ExpenseItem(name=&amp;#39;Beers&amp;#39;, amount=XXX, currency=&amp;#39;DKK&amp;#39;, payer=&amp;#39;Dudu (me)&amp;#39;, emoji=&amp;#39;🍺&amp;#39;),&lt;/span&gt;

&lt;span class="c1"&gt;# ExpenseItem(name=&amp;#39;Decatlon comun&amp;#39;, amount=XXX, currency=&amp;#39;DKK&amp;#39;, payer=&amp;#39;Vitto&amp;#39;, emoji=&amp;#39;💶&amp;#39;),&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Just like that, we now have a dataset of ~50 expenses. Great. Let’s get to classifying.&lt;/p&gt;
&lt;h2 id="classification-using-gpt-4o-and-structured-outputs"&gt;Classification using &lt;code&gt;gpt-4o&lt;/code&gt; and structured outputs&lt;/h2&gt;
&lt;p&gt;Before we do anything else, how do the &lt;em&gt;big guns&lt;/em&gt; perform? What if we just ask &lt;code&gt;gpt-4o&lt;/code&gt; to classify? This is the simplest of options, even simpler now that OpenAI directly supports Pydantic base models via structured outputs. We can solve this problem in a couple dozen lines of code:&lt;/p&gt;
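&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;emoji&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;field_validator&lt;/span&gt;

&lt;span class="c1"&gt;# define pydantic model&lt;/span&gt;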

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EmojiClassification&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="sd"&gt;    A single emoji that describes a financial transaction (e.g., Airplane ticket -&amp;gt; ✈️)&lt;/span&gt;
&lt;span class="sd"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

    &lt;span class="n"&gt;emoji&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;The emoji that describes the transaction (Must be a single character!)&amp;quot;&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nd"&gt;@field_validator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;emoji&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nd"&gt;@classmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;validate_emoji&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;cls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;emoji&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;is_emoji&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;The emoji must be a single character! Received: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;

&lt;span class="c1"&gt;# classification function&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;classify_openai&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expense&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;role&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;user&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;content&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Classify this transaction: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;expense&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;completion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;gpt-4o-mini-2024-07-18&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;response_format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EmojiClassification&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;emoji&lt;/span&gt;

&lt;span class="n"&gt;classify_openai&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Treno Milano Ancona&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# returns &amp;#39;🚆&amp;#39;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;That was easy. But also not particularly exciting. What if our data is sensitive? What if our API gets extremely popular? What if we just want to handle it &lt;em&gt;ourselves&lt;/em&gt;?  Could we do the same with some off-the-shelf open-source models?&lt;/p&gt;
&lt;h2 id="function-calling-local-models-with-ollama"&gt;Function calling local models with Ollama&lt;/h2&gt;
&lt;p&gt;I would love to tell you that open-source function calling was a solved problem. And yes - the &lt;a href="https://gorilla.cs.berkeley.edu/"&gt;function calling leaderboard&lt;/a&gt; certainly seems to tell that story. But the reality of local models is &lt;em&gt;a bit&lt;/em&gt; different.&lt;/p&gt;
&lt;p&gt;Here’s the code to accomplish the same thing as the OpenAI version above, using an &lt;a href="https://ollama.com/"&gt;Ollama&lt;/a&gt; model that supports function calling:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Eats pydantic model, shoots &amp;#39;json schema&amp;#39;&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;kw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;annotation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;default&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Parameter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;empty&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;inspect&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;signature&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;create_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Input for `&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;type&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;function&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;function&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;description&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="vm"&gt;__doc__&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;parameters&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Don&amp;#39;t even try without this (email me if you have a better option!)&lt;/span&gt;

&lt;span class="n"&gt;SYSTEM_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="s2"&gt;You are a genius expert. Your task is to classify the description of a financial transaction with an emoji.&lt;/span&gt;
&lt;span class="s2"&gt;You must only use emojis that are a single character.&lt;/span&gt;
&lt;span class="s2"&gt;You must only reply with a single emoji.&lt;/span&gt;

&lt;span class="s2"&gt;Example 1:&lt;/span&gt;
&lt;span class="s2"&gt;User: Treno Milano Ancona&lt;/span&gt;
&lt;span class="s2"&gt;Response: EmojiClassification(emoji=&amp;#39;🚆&amp;#39;)&lt;/span&gt;

&lt;span class="s2"&gt;Example 2:&lt;/span&gt;
&lt;span class="s2"&gt;User: Dinner with friends&lt;/span&gt;
&lt;span class="s2"&gt;Response: EmojiClassification(emoji=&amp;#39;🍽&amp;#39;)&lt;/span&gt;

&lt;span class="s2"&gt;Example 3:&lt;/span&gt;
&lt;span class="s2"&gt;User: Plane ticket to New York&lt;/span&gt;
&lt;span class="s2"&gt;Response: EmojiClassification(emoji=&amp;#39;✈️&amp;#39;)&lt;/span&gt;

&lt;span class="s2"&gt;Use the EmojiClassification tool to help you classify the transactions.&lt;/span&gt;
&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# This is a bad example of retry logic - but you get the point&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;classify_expense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;expense_description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;retries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;classification_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;User: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;expense_description&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;

    &lt;span class="c1"&gt;# poor man&amp;#39;s retry&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;retry_num&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;retries&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# we use OpenAI&amp;#39;s client&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;role&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;system&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;content&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SYSTEM_PROMPT&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;role&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;user&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;content&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;classification_prompt&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EmojiClassification&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
            &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;retry_num&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;tool_calls&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;tool_calls&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;tool_calls&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;tool_calls&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;EmojiClassification&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Failed to classify expense. Expected EmojiClassification but got &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;tool_calls&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;tool_calls&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;None&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;EmojiClassification&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_calls&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;emoji&lt;/span&gt;
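        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;JSONDecodeError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ValidationError&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;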
            &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Validation error: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;

&lt;span class="c1"&gt;# we can now call this with ollama&lt;/span&gt;

&lt;span class="n"&gt;classify_expense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;expense_description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Donut&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;llama3.1&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;http://localhost:11434/v1&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;ollama&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# dummy key - required&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# returns &amp;#39;🍩&amp;#39;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
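
&lt;p&gt;The retry loop above is tangled up with the API call, so here is a minimal, standard-library-only sketch of just the retry idea: bump the temperature a little on every attempt and only accept a reply that parses. The names &lt;code&gt;classify_with_retries&lt;/code&gt; and &lt;code&gt;fake_model&lt;/code&gt; are made up for illustration, and the fake model is a stand-in for the Ollama call, not part of the actual setup.&lt;/p&gt;

```python
import json
from typing import Callable, Optional


def classify_with_retries(
    call_model: Callable[[float], str],
    retries: int = 3,
) -> Optional[str]:
    """Call the model until it returns JSON with a non-empty 'emoji' string."""
    for retry_num in range(retries + 1):
        # Same schedule as the loop above: 0.0, 0.1, 0.2, ...
        temperature = retry_num / 10
        raw_arguments = call_model(temperature)
        try:
            arguments = json.loads(raw_arguments)
        except json.JSONDecodeError:
            continue  # malformed tool arguments, try again
        emoji_value = arguments.get("emoji") if isinstance(arguments, dict) else None
        if isinstance(emoji_value, str) and emoji_value:
            return emoji_value
    return None


# Hypothetical stand-in for a flaky local model: fails twice, then answers
attempts = []


def fake_model(temperature: float) -> str:
    attempts.append(temperature)
    if len(attempts) == 1:
        return "not json"
    if len(attempts) == 2:
        return json.dumps({"label": "food"})
    return json.dumps({"emoji": "🍩"})


result = classify_with_retries(fake_model)
```

&lt;p&gt;With the stand-in above, the first two replies are rejected and the third one is returned. Swapping &lt;code&gt;fake_model&lt;/code&gt; for a function that wraps the real client call should be enough to reuse the loop.&lt;/p&gt;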

&lt;details&gt;
  &lt;summary&gt;Note: Using Instructor to accomplish the same thing&lt;/summary&gt;

A kind user on &lt;a href="https://x.com/molasalex/status/1843553936407703666"&gt;Twitter&lt;/a&gt; reached out to mention that you can do the same with &lt;a href="https://python.useinstructor.com/"&gt;Instructor&lt;/a&gt;.

I got a lot of validation errors at the time, so I did not go this route. The code gets a lot simpler, though, so it is worth a try!


&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;classify_expense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;expense_description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;retries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;http://localhost:11434/v1&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;ollama&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;EmojiClassification&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;instructor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_openai&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;instructor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Mode&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TOOLS_STRICT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;classification_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;User: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;expense_description&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;role&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;system&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;content&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SYSTEM_PROMPT&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;role&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;user&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;content&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;classification_prompt&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;response_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EmojiClassification&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;retries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;/details&gt;

&lt;p&gt;Perfect, we can now classify our expense descriptions with any local Ollama model that supports function calling as well.&lt;/p&gt;
&lt;p&gt;Why don’t we try some old-school machine learning?&lt;/p&gt;
&lt;h2 id="embeddings-for-classification"&gt;Embeddings for classification&lt;/h2&gt;
&lt;p&gt;Embeddings are another option. Open source, fast, and lightweight. Here’s one possible approach: given a list of emojis and their descriptions, plus the description of an expense, return the emoji whose description is ‘most similar’ to the expense.&lt;/p&gt;
&lt;p&gt;Easier in code than in English:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# load a small db of emojis and descriptions&lt;/span&gt;

&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;./emojis.csv&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index_col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;header&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;emoji&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;description&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# prepare column to be embedded (I hate this)&lt;/span&gt;

&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;description&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;description&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;quot;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;to_embed&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;passage: &amp;quot;&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;description&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# embed it (multilingual embedding, since the descriptions are in IT and PT)&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sentence_transformers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SentenceTransformer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;intfloat/multilingual-e5-base&amp;quot;&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;to_embed&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;show_progress_bar&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;mps&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;assign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="c1"&gt;# create a lancedb table with the data&lt;/span&gt;

&lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lancedb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;.my_db&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;table_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;emojis&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create_table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;orient&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;records&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Created table &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;table_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Why not try a re-ranker as well?&lt;/span&gt;

&lt;span class="n"&gt;ranker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Reranker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;nreimers/mmarco-mMiniLMv2-L12-H384-v1&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Create a classification function&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;embedding_classify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;expense_description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# number of closest emoji to re-rank&lt;/span&gt;
    &lt;span class="n"&gt;use_reranker&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;debug&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;query: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;expense_description&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;show_progress_bar&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;mps&amp;quot;&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_pandas&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;use_reranker&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iloc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;emoji&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ranker&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;expense_description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;description&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;doc_ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;emoji&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;doc_id&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding_classify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Beers and pizza with friends&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;use_reranker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# prints 🍕&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding_classify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Almoco&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;use_reranker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# prints ⚗&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Great, both the vanilla and re-ranked versions of the function appear to work correctly. We now have at least three ways of classifying the description of an expense.&lt;/p&gt;
&lt;h2 id="benchmarking-expense-categorization"&gt;Benchmarking expense categorization&lt;/h2&gt;
&lt;p&gt;We now have all the functions we need. The code below runs a simple benchmark: for the first 30 of the 50 expense descriptions we gathered for our dataset, it runs a set of different techniques/models.&lt;/p&gt;
&lt;p&gt;I didn’t test all models in the world. I tested &lt;code&gt;gpt-4o&lt;/code&gt;, four local models I usually run on my laptop, and the two embedding techniques. Here’s the benchmarking code and results:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;CACHE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;benchmark&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expense&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ExpenseItem&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;expense&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;CACHE&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;CACHE&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;expense&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;description&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;expense&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;tricount_emoji&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;expense&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;emoji&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;openai&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;classify_openai&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expense&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;llama3.1_7b&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;classify_expense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;expense&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;llama3.1&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;api_params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;qwen2.5_3b&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;classify_expense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;expense&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;qwen2.5:3b&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;api_params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;qwen2.5_1.5b&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;classify_expense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;expense&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;qwen2.5:1.5b&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;api_params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;llama3.2_3b&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;classify_expense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;expense&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;llama3.2&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;api_params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;embedding&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;embedding_classify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expense&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;use_reranker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;embedding_reranker&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;embedding_classify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expense&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;use_reranker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;CACHE&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;expense&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;benchmark&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expense&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;expense&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;expenses&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;expenses&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;bench_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;bench_df&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;description&lt;/th&gt;
&lt;th&gt;tricount_emoji&lt;/th&gt;
&lt;th&gt;openai&lt;/th&gt;
&lt;th&gt;llama3.1_7b&lt;/th&gt;
&lt;th&gt;qwen2.5_3b&lt;/th&gt;
&lt;th&gt;qwen2.5_1.5b&lt;/th&gt;
&lt;th&gt;llama3.2_3b&lt;/th&gt;
&lt;th&gt;embedding&lt;/th&gt;
&lt;th&gt;embedding_reranker&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Beers&lt;/td&gt;
&lt;td&gt;🍺&lt;/td&gt;
&lt;td&gt;🍺&lt;/td&gt;
&lt;td&gt;🍺&lt;/td&gt;
&lt;td&gt;🍺&lt;/td&gt;
&lt;td&gt;🍺&lt;/td&gt;
&lt;td&gt;🍺&lt;/td&gt;
&lt;td&gt;🍺&lt;/td&gt;
&lt;td&gt;🍻&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Beers&lt;/td&gt;
&lt;td&gt;🍺&lt;/td&gt;
&lt;td&gt;🍺&lt;/td&gt;
&lt;td&gt;🍺&lt;/td&gt;
&lt;td&gt;🍺&lt;/td&gt;
&lt;td&gt;🍺&lt;/td&gt;
&lt;td&gt;🍺&lt;/td&gt;
&lt;td&gt;🍺&lt;/td&gt;
&lt;td&gt;🍻&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Decatlon comun&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;🛒&lt;/td&gt;
&lt;td&gt;🏃&lt;/td&gt;
&lt;td&gt;🛍&lt;/td&gt;
&lt;td&gt;🛍&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;🌳&lt;/td&gt;
&lt;td&gt;📨&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Duarte decatlon&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;🏃‍♂️&lt;/td&gt;
&lt;td&gt;🏃&lt;/td&gt;
&lt;td&gt;🛍&lt;/td&gt;
&lt;td&gt;🛍&lt;/td&gt;
&lt;td&gt;🏦&lt;/td&gt;
&lt;td&gt;🌳&lt;/td&gt;
&lt;td&gt;🌳&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Festa&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;🎉&lt;/td&gt;
&lt;td&gt;🎉&lt;/td&gt;
&lt;td&gt;🎉&lt;/td&gt;
&lt;td&gt;🎉&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;🎭&lt;/td&gt;
&lt;td&gt;🎑&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Jantar&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;🍽️&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;👩‍🍳&lt;/td&gt;
&lt;td&gt;👩‍🍳&lt;/td&gt;
&lt;td&gt;🏯&lt;/td&gt;
&lt;td&gt;⚱&lt;/td&gt;
&lt;td&gt;⚱&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Almoco&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;🍽️&lt;/td&gt;
&lt;td&gt;🍏&lt;/td&gt;
&lt;td&gt;🍔&lt;/td&gt;
&lt;td&gt;🍔&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;⚗&lt;/td&gt;
&lt;td&gt;⚗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Jantar&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;🍽️&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;👩‍🍳&lt;/td&gt;
&lt;td&gt;👩‍🍳&lt;/td&gt;
&lt;td&gt;🏯&lt;/td&gt;
&lt;td&gt;⚱&lt;/td&gt;
&lt;td&gt;⚱&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hotel Bolo&lt;/td&gt;
&lt;td&gt;🏨&lt;/td&gt;
&lt;td&gt;🏨&lt;/td&gt;
&lt;td&gt;🏨&lt;/td&gt;
&lt;td&gt;🏨&lt;/td&gt;
&lt;td&gt;🏨&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;🏨&lt;/td&gt;
&lt;td&gt;🏚&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Treno Milano Ancona&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;🚆&lt;/td&gt;
&lt;td&gt;🚂&lt;/td&gt;
&lt;td&gt;🚆&lt;/td&gt;
&lt;td&gt;🚆&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;🚄&lt;/td&gt;
&lt;td&gt;🚡&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cena vigana milano&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;🍽️&lt;/td&gt;
&lt;td&gt;🍴&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;🌖&lt;/td&gt;
&lt;td&gt;🌐&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Train venezia milano&lt;/td&gt;
&lt;td&gt;🚂&lt;/td&gt;
&lt;td&gt;🚆&lt;/td&gt;
&lt;td&gt;🚂&lt;/td&gt;
&lt;td&gt;🚆&lt;/td&gt;
&lt;td&gt;🚆&lt;/td&gt;
&lt;td&gt;🚂&lt;/td&gt;
&lt;td&gt;🚄&lt;/td&gt;
&lt;td&gt;🚆&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reimbursement&lt;/td&gt;
&lt;td&gt;💳&lt;/td&gt;
&lt;td&gt;💸&lt;/td&gt;
&lt;td&gt;💸&lt;/td&gt;
&lt;td&gt;⛺&lt;/td&gt;
&lt;td&gt;⛺&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;🚑&lt;/td&gt;
&lt;td&gt;🎁&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Spritz&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;🍹&lt;/td&gt;
&lt;td&gt;🍹&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;🥨&lt;/td&gt;
&lt;td&gt;✨&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mozzarella&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;🧀&lt;/td&gt;
&lt;td&gt;🧀&lt;/td&gt;
&lt;td&gt;🧀&lt;/td&gt;
&lt;td&gt;🧀&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;🍝&lt;/td&gt;
&lt;td&gt;🍄&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Spritz&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;🍹&lt;/td&gt;
&lt;td&gt;🍹&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;🥨&lt;/td&gt;
&lt;td&gt;✨&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Barca&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;⚽&lt;/td&gt;
&lt;td&gt;⚽&lt;/td&gt;
&lt;td&gt;🏀&lt;/td&gt;
&lt;td&gt;🏀&lt;/td&gt;
&lt;td&gt;🚣&lt;/td&gt;
&lt;td&gt;🔖&lt;/td&gt;
&lt;td&gt;📶&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kwikly&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;🛒&lt;/td&gt;
&lt;td&gt;💳&lt;/td&gt;
&lt;td&gt;🏃&lt;/td&gt;
&lt;td&gt;🏃&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;🥝&lt;/td&gt;
&lt;td&gt;🥝&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Spesa&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;💳&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;💰&lt;/td&gt;
&lt;td&gt;💰&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;♠&lt;/td&gt;
&lt;td&gt;🌾&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Voli Tokyo&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;✈️&lt;/td&gt;
&lt;td&gt;💖&lt;/td&gt;
&lt;td&gt;✈️&lt;/td&gt;
&lt;td&gt;✈️&lt;/td&gt;
&lt;td&gt;✂&lt;/td&gt;
&lt;td&gt;🗼&lt;/td&gt;
&lt;td&gt;🗼&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Conguaglio voli Zurigo&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;✈️&lt;/td&gt;
&lt;td&gt;✈️&lt;/td&gt;
&lt;td&gt;✈️&lt;/td&gt;
&lt;td&gt;✈️&lt;/td&gt;
&lt;td&gt;✈&lt;/td&gt;
&lt;td&gt;🔄&lt;/td&gt;
&lt;td&gt;🌐&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Amar&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;👤&lt;/td&gt;
&lt;td&gt;💸&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;🤑&lt;/td&gt;
&lt;td&gt;🕉&lt;/td&gt;
&lt;td&gt;💢&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Groceries&lt;/td&gt;
&lt;td&gt;🛒&lt;/td&gt;
&lt;td&gt;🛒&lt;/td&gt;
&lt;td&gt;🛍&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;🛍&lt;/td&gt;
&lt;td&gt;🥫&lt;/td&gt;
&lt;td&gt;🍲&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Durum&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;🍞&lt;/td&gt;
&lt;td&gt;🥖&lt;/td&gt;
&lt;td&gt;🍞&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;📦&lt;/td&gt;
&lt;td&gt;🥁&lt;/td&gt;
&lt;td&gt;🛢&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Groceries&lt;/td&gt;
&lt;td&gt;🛒&lt;/td&gt;
&lt;td&gt;🛒&lt;/td&gt;
&lt;td&gt;🛍&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;🛍&lt;/td&gt;
&lt;td&gt;🥫&lt;/td&gt;
&lt;td&gt;🍲&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Money transfer&lt;/td&gt;
&lt;td&gt;💳&lt;/td&gt;
&lt;td&gt;💸&lt;/td&gt;
&lt;td&gt;💸&lt;/td&gt;
&lt;td&gt;💸&lt;/td&gt;
&lt;td&gt;💸&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;💰&lt;/td&gt;
&lt;td&gt;🤑&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lunch&lt;/td&gt;
&lt;td&gt;🍜&lt;/td&gt;
&lt;td&gt;🍔&lt;/td&gt;
&lt;td&gt;🍏&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;🍽&lt;/td&gt;
&lt;td&gt;🥪&lt;/td&gt;
&lt;td&gt;🕐&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cafe&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;☕&lt;/td&gt;
&lt;td&gt;☕&lt;/td&gt;
&lt;td&gt;☕&lt;/td&gt;
&lt;td&gt;☕&lt;/td&gt;
&lt;td&gt;☕&lt;/td&gt;
&lt;td&gt;☕&lt;/td&gt;
&lt;td&gt;🔪&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Spesa kwikly&lt;/td&gt;
&lt;td&gt;💶&lt;/td&gt;
&lt;td&gt;🛒&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;🛒&lt;/td&gt;
&lt;td&gt;🛒&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;🐳&lt;/td&gt;
&lt;td&gt;💪&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pizza&lt;/td&gt;
&lt;td&gt;🍕&lt;/td&gt;
&lt;td&gt;🍕&lt;/td&gt;
&lt;td&gt;🍕&lt;/td&gt;
&lt;td&gt;🍕&lt;/td&gt;
&lt;td&gt;🍕&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;🍕&lt;/td&gt;
&lt;td&gt;🍕&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;em&gt;Note: &lt;code&gt;None&lt;/code&gt; means the model did not call the appropriate function, and thus failed to produce a classification.&lt;/em&gt;&lt;/p&gt;
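&lt;p&gt;To put a number on those failures, here is a minimal sketch that counts the share of &lt;code&gt;None&lt;/code&gt; values per model column. It assumes rows shaped like the dictionaries returned by &lt;code&gt;benchmark&lt;/code&gt;. The &lt;code&gt;failure_rates&lt;/code&gt; helper is hypothetical, and the three sample rows are a hand-copied subset of the table above, only there for illustration:&lt;/p&gt;

```python
from collections import Counter

# Columns that are metadata rather than model predictions.
NON_MODEL_COLUMNS = {"description", "tricount_emoji"}


def failure_rates(rows: list[dict]) -> dict[str, float]:
    """Share of rows where each model column is None (no function call)."""
    failures = Counter()
    for row in rows:
        for column, value in row.items():
            if column in NON_MODEL_COLUMNS:
                continue
            # Adding 0 still registers the column, even if it never fails.
            failures[column] += 1 if value is None else 0
    return {column: count / len(rows) for column, count in failures.items()}


# Hypothetical sample: three rows and two model columns from the table.
sample = [
    {"description": "Beers", "tricount_emoji": "🍺", "openai": "🍺", "llama3.2_3b": "🍺"},
    {"description": "Festa", "tricount_emoji": "💶", "openai": "🎉", "llama3.2_3b": None},
    {"description": "Pizza", "tricount_emoji": "🍕", "openai": "🍕", "llama3.2_3b": None},
]

print(failure_rates(sample))
```

&lt;p&gt;This only measures how often a model failed to answer, not whether the emoji it picked was a good one. That part still needs a human eye.&lt;/p&gt;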
&lt;h2 id="closing-thoughts"&gt;Closing thoughts&lt;/h2&gt;
&lt;p&gt;Looking at the results, we can observe a couple of interesting things:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OpenAI’s &lt;code&gt;gpt-4o&lt;/code&gt; is hard to beat.&lt;/strong&gt; It pretty much &lt;em&gt;never&lt;/em&gt; missed, even when Vitto was too lazy to write the rest of the description. My guilty pleasure shopping at Decathlon? Got it. &lt;em&gt;Barca&lt;/em&gt; actually referred to a &lt;em&gt;boat&lt;/em&gt; - and not the football club - but there’s no way it could’ve figured that one out. &lt;em&gt;Kwikly&lt;/em&gt; is a local supermarket we go to in Copenhagen - and 🛒 looks like a store emoji to me!&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The bigger the open model, the better the results.&lt;/strong&gt; And Llama 3 shows it. It performs pretty well, especially at the 7b size. Not without hiccups, of course (I can’t figure out how 💖 relates to flights).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Small open source models are getting good.&lt;/strong&gt; Not sure what happened to Llama 3.2 3b - but at the 3b and 1.5b sizes, Alibaba’s Qwen2.5 family of models performs surprisingly well. Some bigger problems &lt;em&gt;do&lt;/em&gt; show up, and we also see a lot of failures to call the right function (hence the &lt;code&gt;None&lt;/code&gt;s). I’m still curious about two things: (1) how much better we could get through prompting, and (2) how much better larger open source models perform (think 70B or 400B). I can’t run those myself, so it’s beside the point of this post.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The embedding idea sounded interesting, but didn’t really perform.&lt;/strong&gt; We would probably have to do a bit more work than just a cosine similarity on the emoji descriptions. Even though the embeddings were multilingual, they clearly didn’t capture the essence of the expense. Another approach I would test is to create a labeled dataset with &lt;code&gt;gpt-4o&lt;/code&gt; and then train a simple random forest on the embeddings. But again, that’s beside the point of this post.&lt;/p&gt;
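&lt;p&gt;For reference, the “just a cosine similarity” step boils down to a nearest-neighbour lookup like the sketch below. The three-dimensional vectors are made up for illustration (real embeddings have hundreds of dimensions), and &lt;code&gt;nearest_emoji&lt;/code&gt; is a hypothetical helper, not the vector search used in &lt;code&gt;embedding_classify&lt;/code&gt;:&lt;/p&gt;

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_emoji(query: list[float], emoji_vectors: dict[str, list[float]]) -> str:
    """Return the emoji whose vector is most similar to the query."""
    return max(emoji_vectors, key=lambda e: cosine_similarity(query, emoji_vectors[e]))


# Made-up toy vectors, not real model output.
emoji_vectors = {
    "🍕": [0.9, 0.1, 0.0],
    "🚆": [0.0, 0.2, 0.9],
    "🏨": [0.1, 0.9, 0.2],
}
query = [0.8, 0.2, 0.1]  # pretend embedding of an expense description

print(nearest_emoji(query, emoji_vectors))

# prints 🍕
```

&lt;p&gt;The lookup can only be as good as the vectors it compares: if a short description like &lt;em&gt;Almoco&lt;/em&gt; lands far from the food emoji descriptions, no amount of ranking on top will bring it back.&lt;/p&gt;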
&lt;p&gt;In any case, we can for sure do better than what Tricount is currently doing!&lt;/p&gt;
&lt;hr /&gt;
&lt;h2 id="epilogue-just-get-the-right-token"&gt;Epilogue: Just get the right token&lt;/h2&gt;
&lt;p&gt;Sorry, I couldn’t resist. I need to test one last thing. OpenAI has a cool feature called &lt;a href="https://cookbook.openai.com/examples/using_logprobs"&gt;log probs&lt;/a&gt; in their API that allows you to run classification by checking which token has the highest probability in the completion.&lt;/p&gt;
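&lt;p&gt;As a rough sketch of the idea: given the log probabilities a model assigns to each candidate token, classification is just picking the largest one. The numbers below are invented, each label is assumed to map to a single token, and &lt;code&gt;classify_from_logprobs&lt;/code&gt; is a hypothetical helper rather than part of the OpenAI client:&lt;/p&gt;

```python
import math


def classify_from_logprobs(logprobs: dict[str, float]) -> tuple[str, float]:
    """Pick the most likely label and its probability among the candidates."""
    # Renormalise over the candidates only, so the probabilities sum to 1.
    total = sum(math.exp(lp) for lp in logprobs.values())
    best = max(logprobs, key=logprobs.get)
    return best, math.exp(logprobs[best]) / total


# Invented log probabilities for the first completion token.
candidate_logprobs = {"✈": -0.3, "🗼": -1.9, "🚆": -3.2}

label, confidence = classify_from_logprobs(candidate_logprobs)
print(label, round(confidence, 2))
```

&lt;p&gt;A nice side effect is that you get a confidence score for free, which could be used to fall back to a default emoji when the model is unsure.&lt;/p&gt;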
&lt;p&gt;Unfortunately, &lt;a href="https://github.com/ollama/ollama/issues/2415"&gt;Ollama doesn’t seem to support this&lt;/a&gt;. But I figured out how to do it with plain Transformers. It’s not the most efficient approach, since I have to run the model once for every possible emoji. But do let me know if you (reader) have better suggestions:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;torch&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AutoModelForCausalLM&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;

&lt;span class="n"&gt;model_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;meta-llama/Llama-3.2-1B-Instruct&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoModelForCausalLM&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# log probability of sentence&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;calculate_sentence_log_probability&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;inputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_tensors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;pt&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;no_grad&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;outputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;logits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;outputs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;logits&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;:]&lt;/span&gt;  &lt;span class="c1"&gt;# Align token predictions&lt;/span&gt;
    &lt;span class="n"&gt;shifted_input_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;input_ids&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:]&lt;/span&gt;  &lt;span class="c1"&gt;# Drop first token&lt;/span&gt;

    &lt;span class="n"&gt;log_probs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log_softmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;logits&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;token_log_probs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;log_probs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shifted_input_ids&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;unsqueeze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;squeeze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;token_log_probs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# wrapper to run through the emojis&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;find_most_likely_emoji&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_sentence&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;emoji_list&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;log_probs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;emoji&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;emoji_list&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;sentence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;base_sentence&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;emoji&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;emoji&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;log_probs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;emoji&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;calculate_sentence_log_probability&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;most_likely_emoji&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;log_probs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;log_probs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;most_likely_emoji&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;log_probs&lt;/span&gt;

&lt;span class="c1"&gt;# prompt&lt;/span&gt;

&lt;span class="n"&gt;USER_MESSAGE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="s2"&gt;You are a genius expert. Your task is to classify the description of a financial transaction with an emoji.&lt;/span&gt;
&lt;span class="s2"&gt;You must only use emojis that are a single character.&lt;/span&gt;
&lt;span class="s2"&gt;You must only reply with a single emoji.&lt;/span&gt;

&lt;span class="s2"&gt;Example 1:&lt;/span&gt;
&lt;span class="s2"&gt;User: Treno Milano Ancona&lt;/span&gt;
&lt;span class="s2"&gt;Response: 🚆&lt;/span&gt;
&lt;span class="s2"&gt;...&lt;/span&gt;

&lt;span class="s2"&gt;Now it&amp;#39;s your turn! Classify the following transactions with an emoji:&lt;/span&gt;
&lt;span class="s2"&gt;User: &lt;/span&gt;&lt;span class="si"&gt;{description}&lt;/span&gt;
&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;USER_MESSAGE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;USER_MESSAGE&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Voli Tokyo&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# &amp;#39;Flights Tokyo&amp;#39;&lt;/span&gt;

&lt;span class="c1"&gt;# massaging the formats and tokenizing&lt;/span&gt;

&lt;span class="n"&gt;ASSISTANT_MESSAGE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;Response: &lt;/span&gt;&lt;span class="si"&gt;{emoji}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;chat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;role&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;user&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;content&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;USER_MESSAGE&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;role&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;assistant&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;content&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ASSISTANT_MESSAGE&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;input_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;apply_chat_template&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokenize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)[:&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;base_sentence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# classification&lt;/span&gt;

&lt;span class="n"&gt;emoji_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;🎄&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;🔔&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;🚨&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;💎&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;🇬🇹&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;🍹&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;🚣&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;⚽&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;🥘&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;🛩️&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;🇯🇵&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;✈️&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# example possibilities&lt;/span&gt;
&lt;span class="n"&gt;most_likely_emoji&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;log_probs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;find_most_likely_emoji&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_sentence&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;emoji_list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Most likely emoji: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;most_likely_emoji&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;emoji&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;log_prob&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;log_probs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Log probability of &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;emoji&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;log_prob&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Most likely emoji: ✈️&lt;/span&gt;

&lt;span class="c1"&gt;# Log probability of 🎄: -582.9215087890625&lt;/span&gt;

&lt;span class="c1"&gt;# Log probability of 🔔: -580.827392578125&lt;/span&gt;

&lt;span class="c1"&gt;# Log probability of 🚨: -577.4615478515625&lt;/span&gt;

&lt;span class="c1"&gt;# Log probability of 💎: -582.4413452148438&lt;/span&gt;

&lt;span class="c1"&gt;# Log probability of 🇬🇹: -580.4923095703125&lt;/span&gt;

&lt;span class="c1"&gt;# Log probability of 🍹: -581.0596923828125&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Works well!&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Around Iceland in 6 days</title><link href="https://duarteocarmo.com/blog/around-iceland-6-days-camping-itinerary.html" rel="alternate"/><published>2024-08-30T15:30:00+02:00</published><updated>2024-08-30T15:30:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2024-08-30:/blog/around-iceland-6-days-camping-itinerary.html</id><summary type="html">&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/69/landscape.png" alt="Icelandic Landscape" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;For a short summer break, Vitto challenged me to visit Iceland with her. As someone who is used to going south for the summer, this sounded stupid at first, but I was &lt;em&gt;incredibly&lt;/em&gt; surprised. For 6 days, we were pretty much on another planet. Here are some notes and tips …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/69/landscape.png" alt="Icelandic Landscape" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;For a short summer break, Vitto challenged me to visit Iceland with her. As someone who is used to going south for the summer, this sounded stupid at first, but I was &lt;em&gt;incredibly&lt;/em&gt; surprised. For 6 days, we were pretty much on another planet. Here are some notes and tips from our trip.&lt;/p&gt;
&lt;h2 id="minimal-pre-trip-preparations"&gt;Minimal pre-trip preparations&lt;/h2&gt;
&lt;p&gt;We only did two things in preparation for the trip. The first was renting our home for the week: a camper van. After some googling we decided to go with &lt;a href="https://www.kukucampers.is/"&gt;Kuku Campers&lt;/a&gt;. It’s a ‘local’ shop, run mostly by EU immigrants working there for the summer. The price was reasonable and the staff was friendly. No complaints.&lt;/p&gt;
&lt;p&gt;The second preparation is the most critical one. It’s a small trick I learned 4 years ago from my friend Alex. That little trick is called &lt;em&gt;&lt;a href="https://www.amazon.fr/Guide-Routard-Islande-2024-25/dp/2017888397/ref=sr_1_1?dib=eyJ2IjoiMSJ9.MBhiA55sDoSmJRDHLdCaoFNDtWciwNpW3SO7B2P9O8VEC35Mgt6SIiZlO5-H3t6hIg5cYtxZgYfIxbwm_DF3b_CoNFUHz3BhhiAtRd50Iz8rJA7Ak1CH_g0UZJw05ahkilvyujQkRD6weKxwJsMjZDvMttgk_Bp1Pa97vDQMQYtX8ZLyQ4aIhJByrEA-FznfPvUveIIbRLk1d9Tvrt52Wf81sy7Aa_Nrnoz8xoUbRA0.OIxFW121aSdMTzierOgfEtop-VwDN0CT9EBz_dTrDog&amp;amp;dib_tag=se&amp;amp;keywords=Guide+Routard+Islande&amp;amp;qid=1724441068&amp;amp;s=books&amp;amp;sr=1-1"&gt;Le Petit Routard&lt;/a&gt;&lt;/em&gt;. We bought the Iceland 2024 Edition. As always, it did &lt;em&gt;not&lt;/em&gt; disappoint. It’s just like traveling around with a local guide. And there’s just something about the French way of &lt;em&gt;routard&lt;/em&gt; travel that has always stuck with me. You won’t know until you’ve tried it.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;&lt;/p&gt;
&lt;iframe src="https://www.google.com/maps/d/u/1/embed?mid=15DR367Ywerql0T_GuuHBby5EN4nrgUI&amp;ehbc=2E312F" style="width: 100%; height: 480px; max-width: 100%;" allowfullscreen&gt;&lt;/iframe&gt;
&lt;p&gt;&lt;/center&gt;&lt;/p&gt;
&lt;h2 id="day-1-lift-off-golden-triangle"&gt;Day 1: Lift off &amp;amp; Golden Triangle&lt;/h2&gt;
&lt;p&gt;We woke up early in Reykjavik and had breakfast at the &lt;a href="https://www.centerhotels.com/en/hotel-plaza-reykjavik"&gt;Center Hotels Plaza&lt;/a&gt;. Once done, we went back to the airport region of Keflavik and picked up our camping van. We set off towards the first ‘&lt;em&gt;coup de coeur&lt;/em&gt;’ from the Routard: the Golden Triangle Region.&lt;/p&gt;
&lt;p&gt;We managed to visit the place where the two tectonic plates meet, the &lt;a href="https://www.thingvellir.is/en/"&gt;Þingvellir national park&lt;/a&gt;, which brought back some good memories of geological walks with my grandparents. We also managed to quickly hit the original &lt;em&gt;Geysir&lt;/em&gt; (it often reminded me of what we see in the Azores region). Except Icelandic people don’t &lt;a href="https://youtu.be/F9piJgQKqRQ?si=ip08U8DCYfLfI1Uk"&gt;cook underground like us&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Closed off Day 1 with our first &lt;a href="https://g.co/kgs/cLGDmwS"&gt;campsite&lt;/a&gt;. Nothing special - pretty noisy.&lt;/p&gt;
&lt;h2 id="day-2-running-and-waterfalls"&gt;Day 2: Running and Waterfalls&lt;/h2&gt;
&lt;p&gt;The marathon training didn’t necessarily stop. We took the early morning and decided to go for a long run along the woods near the Geyser region. We ended up getting lost and running 17km in the middle of the Lunar landscape. Beautiful.&lt;/p&gt;
&lt;p&gt;The second part of the day was all about waterfalls. We drove to the final element of the Golden Triangle: the &lt;a href="https://maps.app.goo.gl/o8j1sqfVBtxPad747"&gt;Gullfoss waterfall&lt;/a&gt;. Once done, we started making our way east while listening to a great &lt;a href="https://podcasts.apple.com/us/podcast/the-uniqueness-of-iceland/id1528861092?i=1000657271696"&gt;podcast episode&lt;/a&gt; about Icelandic Geology.&lt;/p&gt;
&lt;p&gt;On the road to &lt;a href="https://maps.app.goo.gl/oDSkGTL9hgBUq6wd9"&gt;Vik&lt;/a&gt;, two wonders: &lt;a href="https://www.google.com/maps/place/Seljalandsfoss/@63.6114097,-19.988622,13.49z/data=!4m6!3m5!1s0x48d71eade8ef2415:0xae01e6205209178d!8m2!3d63.6156232!4d-19.9885688!16zL20vMDMyZ2tq?entry=ttu&amp;amp;g_ep=EgoyMDI0MDgyMy4wIKXMDSoASAFQAw=="&gt;Seljalandsfoss&lt;/a&gt; is quite a sight. You can go around it too! &lt;a href="https://maps.app.goo.gl/tshJTU2MwUaj75Sc6"&gt;Hestavaðsfoss&lt;/a&gt; was also impressive. Even though our legs were a bit cooked, we hiked a bit as well. Next time, we’ll hike a bit more in this region.&lt;/p&gt;
&lt;p&gt;Tired, we finally reached the &lt;a href="https://maps.app.goo.gl/AzKHjnnYykrs34F1A"&gt;campsite&lt;/a&gt; in Vik. Some well deserved showers, some subpar tortellini and called it a night. Woke up in the middle of the night to a shaking van. Strong winds in the north.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/69/waterfalls_optim.png" alt="Icelandic Landscape" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;h2 id="day-3-a-beach-full-of-ice"&gt;Day 3: A beach full of ice&lt;/h2&gt;
&lt;p&gt;We woke up early to the sound of the wind. After some breakfast and packing, we drove along the east towards the &lt;a href="https://maps.app.goo.gl/XjzPXRTZRNP5fy4DA"&gt;Svínafellsjökull Glacier&lt;/a&gt;. A beautiful location where you park the van, walk ~4/5 km and see the beginning of where the glacier comes down from the mountain.&lt;/p&gt;
&lt;p&gt;After another 45 min drive, we finally ended up at one of the highlights of the trip: &lt;a href="https://maps.app.goo.gl/tUfjSYeX2vnzGXFbA"&gt;Diamond Beach&lt;/a&gt;. The scenery is fantastic. It sits close to where the Jökulsárlón lake touches the ocean. The glacier melts into the lake, and the large blocks of ice are washed out into the ocean. Some of them end up on a beautiful black sand beach, melting away. &lt;a href="/photos/f7bb8254f387803f717e339dde92f0e191a023f9.html"&gt;It’s hard to put into words&lt;/a&gt;, but it’s quite the sight.&lt;/p&gt;
&lt;p&gt;Recklessly ignoring the weather warnings, we then drove along the Eastern coast. It was a scary drive, with &lt;em&gt;very strong&lt;/em&gt; winds, on a narrow road between the mountain and the ocean. Had we gotten a larger van, we would’ve probably stopped right then and there.&lt;/p&gt;
&lt;p&gt;But with Vitto’s co-piloting, we managed to get into the &lt;a href="https://maps.app.goo.gl/2xNgDRZZ4FXLM9G96"&gt;Fossárdalur Campsite&lt;/a&gt;; probably the most beautiful campsite we stayed at. Hidden amongst the Fjords, sheltered from the wind. The kitchen area was super cozy, with electricity, many stoves, and some long tables to enjoy a warm meal.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/69/whales_optim.png" alt="Icelandic Landscape" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;h2 id="day-4-volcanos-and-thermal-baths"&gt;Day 4: Volcanos and Thermal baths&lt;/h2&gt;
&lt;p&gt;We started heading north-west. It’s incredible how the landscape changes while you’re driving in Iceland. It’s truly a moonlike place. On our way to the Mývatn region, we stopped at one of the strongest waterfalls in the world: the &lt;a href="https://maps.app.goo.gl/GRY1depqdGGkdpko9"&gt;Dettifoss&lt;/a&gt; waterfall. You would’ve never guessed there’s a waterfall there! Especially one that large. A bit of walking in a moonlike landscape and BAM! A gigantic waterfall.&lt;/p&gt;
&lt;p&gt;A couple more &lt;a href="https://serialpodcast.org/season-one"&gt;Serial&lt;/a&gt; episodes and driving, and we finally reached the Mývatn lake region. The landscape looks like nothing we’ve ever seen before: it’s as if a volcano erupted near a lake and transformed the landscape in a Halong Bay-like way.&lt;/p&gt;
&lt;p&gt;During the afternoon, I challenged Vitto to a long run towards a volcano we spotted. At first, we thought we would just get near it, but once we got close enough, it was obvious some people were actually hiking to the top! We had to do the same. The view wasn’t mind-blowing, but the challenge was fun. There’s nothing quite like running as a couple.&lt;/p&gt;
&lt;p&gt;After that long run/trail we were both pretty tired again. We set our GPS to the &lt;a href="https://maps.app.goo.gl/j5gT9iTg6rbYfY7h8"&gt;Mývatn Nature Baths&lt;/a&gt;. If you’ve heard of the &lt;a href="https://www.bluelagoon.com/"&gt;Blue Lagoon&lt;/a&gt;, it’s similar - but a little less packed and touristy. Oh, and it was open until 22:00.&lt;/p&gt;
&lt;p&gt;Once we had enough of the blue water, jacuzzi-like temperature, and volcanic views, we drove to &lt;a href="https://maps.app.goo.gl/4BHGV3BZi9LL2SFn9"&gt;Camping Myvatn&lt;/a&gt;, another scenic campsite overlooking the lake. A little Barilla and pesto to fill up the belly, and we called it a night.&lt;/p&gt;
&lt;h2 id="day-5-whales-and-fjords"&gt;Day 5: Whales and Fjords&lt;/h2&gt;
&lt;p&gt;When I was young, I was pretty lucky to see the mythical &lt;a href="https://en.wikipedia.org/wiki/Blue_whale"&gt;Blue Whales&lt;/a&gt; in the Azores. But Vitto had never seen any! We were told by many that Húsavík was &lt;em&gt;the&lt;/em&gt; capital of whale watching. But we decided we were going to try our luck in the lesser-known &lt;a href="https://www.whalewatchingakureyri.is/"&gt;Akureyri&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We woke up early, packed up our stuff, and drove all the way to Akureyri (to catch the 10:00 AM tour). We had a great time, and saw at least 3/4 whales on different occasions. The tour was well organized, and the staff was calm and professional. The weather was also great, and the whales were out.&lt;/p&gt;
&lt;p&gt;From Akureyri we did another long stretch of driving all the way to the North West region of Iceland. Supposedly less known and more remote. The roads were in worse shape, with lots of gravel, but the van handled it.&lt;/p&gt;
&lt;p&gt;Our goal was to reach the &lt;a href="https://maps.app.goo.gl/7zA6fRGYWuTWZXbB9"&gt;Flókalundur&lt;/a&gt; region. The views and road were absolutely mind-blowing. The MAJESTIC Fjords mixed with a Lisbon-like sunlight are something I had never seen before.&lt;/p&gt;
&lt;p&gt;Just before arriving at camp, we also discovered the little &lt;a href="https://maps.app.goo.gl/aGjwFPPnm1iQsvPy9"&gt;Hellalaugur hot spring&lt;/a&gt;. Just like a little stone jacuzzi in the middle of the rocks. “What language are you guys speaking?” asked a German family. “Oh. We speak a mix of Portuguese and Italian.” Always fun looking at their reaction. “We got some of the Italian but couldn’t decrypt the &lt;em&gt;other one&lt;/em&gt;”.&lt;/p&gt;
&lt;p&gt;We decided to indulge ourselves and eat at the &lt;a href="https://maps.app.goo.gl/6Dz67Xq3N1TFpLmM9"&gt;Hotel/Campsite/Restaurant&lt;/a&gt;. The food was good and reasonably priced! The views from the van were stunning.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/69/beach_van_optim.png" alt="Icelandic Landscape" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;h2 id="day-6-famous-waterfalls-and-road-to-reykjavik"&gt;Day 6: Famous waterfalls and road to Reykjavík&lt;/h2&gt;
&lt;p&gt;We woke up in the middle of the Fjords and set out towards the Dynjandi falls. We arrived early, but it was already packed. Still, the &lt;a href="https://maps.app.goo.gl/9PMwVq1uqFCdnwG57"&gt;Hæstahjallafoss&lt;/a&gt; falls were quite a sight. For a moment, it felt like we were inside an aquarium.&lt;/p&gt;
&lt;p&gt;Kept driving along the coast, and reached the &lt;a href="https://maps.app.goo.gl/juWgKudG6PAHHcfG9"&gt;Kirkjufellsfossar waterfall&lt;/a&gt;. It’s a nice little waterfall that faces a funny shaped mountain called &lt;a href="https://maps.app.goo.gl/D92W4E1sbjgW5bPRA"&gt;Hálsaból Sumarhús&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;And we kept driving. We drove most of the day actually. Even though we wanted to get close to Reykjavík, we didn’t want to go back just yet.&lt;/p&gt;
&lt;p&gt;Somehow, probably through some Scandinavian friends, Vitto got a recommendation to hit the &lt;a href="https://maps.app.goo.gl/KVGhW8nyvfmQEEAz7"&gt;Hammsvik Hot Springs&lt;/a&gt;. These were even better than the previous ones we had visited. The weather was cloudy, but with the 4/5 pools near the Fjords’ bay, the scenery was &lt;em&gt;also&lt;/em&gt; magical. The place closed at 22:00, and we left just before 21:55.&lt;/p&gt;
&lt;p&gt;Vitto managed to find a small camping farm called &lt;a href="https://maps.app.goo.gl/mJ9vjQTZsyRaK2Wz8"&gt;Hjalli Kjós&lt;/a&gt;. We had no big expectations for it, but it ended up being a nice surprise. Managed by a friendly elderly couple. Clean, cosy, shoes off in common areas. The perfect place to take a shower, cook the last portion of Barilla and hit the bunk for the night.&lt;/p&gt;
&lt;h2 id="day-7-going-back-home"&gt;Day 7: Going back home&lt;/h2&gt;
&lt;p&gt;The last day was pretty uneventful. We woke up early again, took a shower, and drove down to Reykjavík. We dropped the van at &lt;a href="https://maps.app.goo.gl/DXe8fNc5PpSKXSUr9"&gt;KuKu&lt;/a&gt;, made sure everything was fine with it, and were dropped off at the Airport to fly home.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/69/routard.jpeg" alt="Icelandic Landscape" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;h2 id="what-well-do-differently-in-the-next-trip"&gt;What we’ll do differently on the next trip&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;4x4 Camping Van&lt;/strong&gt;. Our van was great. Spacious and well-equipped. We were two, and it could easily sleep 3. However, it was a bit too large and “urban”, and made some of the gravel roads a bit uncomfortable. We didn’t have much choice when booking. But next time, I would probably go with something like &lt;a href="https://happycampers.is/camper/happy-4x4-3-pax/"&gt;this&lt;/a&gt; instead.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Less Driving&lt;/strong&gt;: We drove a lot. We wanted to do the complete tour of Iceland. But there were a couple of days when we drove 8+ hours. Even though we listened to nice podcasts, and had nice and long talks, it was a bit too much for our taste. Hey - there are no perfect trips.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;More time&lt;/strong&gt;: We had a full 7 days in Iceland. But looking back, I wish we had just a bit more time. I think 8-9 days would be ideal. Yeah, sure, you could spend a month in Iceland. I’m talking about the type of trip &lt;em&gt;we&lt;/em&gt; like to do.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Less Running, More Hiking&lt;/strong&gt;: Our trip was in the middle of summer, in a week in August. But we were also training for an upcoming marathon. So we had to squeeze in runs here and there. That was fun, and we did a lot of those together. But perhaps we would’ve enjoyed it even more if we had replaced some of those runs with hikes.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Vitto was right. When the summer hits, we, southern Europeans, have this tendency to go South. But there was something special about going North this time. As she said: 9 out of 10.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Tanda Runner: A personalized running dashboard</title><link href="https://duarteocarmo.com/blog/tanda-runner.html" rel="alternate"/><published>2024-07-22T17:55:00+02:00</published><updated>2024-07-22T17:55:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2024-07-22:/blog/tanda-runner.html</id><summary type="html">&lt;p&gt;&lt;center&gt;
&lt;a href="https://tandarunner.duarteocarmo.com"&gt;
&lt;img src="https://duarteocarmo.com/images/68/app.png" alt="Tanda Runner Screenshot"
style="max-width:100%;border-radius: 2px"&gt;
&lt;/a&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://tandarunner.duarteocarmo.com/"&gt;Tanda Runner&lt;/a&gt; is a web app that shows me the things I care most about when preparing for my next marathon. I've also added a running coach/agent designed to give me actionable feedback about my training. Some of that feedback is &lt;em&gt;probably&lt;/em&gt; hallucinatory - I'll get to it in a bit …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;center&gt;
&lt;a href="https://tandarunner.duarteocarmo.com"&gt;
&lt;img src="https://duarteocarmo.com/images/68/app.png" alt="Tanda Runner Screenshot"
style="max-width:100%;border-radius: 2px"&gt;
&lt;/a&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://tandarunner.duarteocarmo.com/"&gt;Tanda Runner&lt;/a&gt; is a web app that shows me the things I care most about when preparing for my next marathon. I've also added a running coach/agent designed to give me actionable feedback about my training. Some of that feedback is &lt;em&gt;probably&lt;/em&gt; hallucinatory - I'll get to it in a bit.&lt;/p&gt;
&lt;p&gt;It's built with &lt;a href="https://www.djangoproject.com/"&gt;Django&lt;/a&gt; - a framework I've gravitated to more and more when building these types of apps. It has all the batteries I need and a time-tested &lt;a href="https://djangopackages.org/grids/g/for-comparison/"&gt;ecosystem&lt;/a&gt;. The front-end is a Frankenstein that uses Django &lt;a href="https://channels.readthedocs.io/en/latest/"&gt;channels&lt;/a&gt;, &lt;a href="https://htmx.org/"&gt;htmx&lt;/a&gt;, and some vanilla JavaScript. It’s not the most responsive PWA out there - but it's still a pretty usable experience. The chat interface is adapted from &lt;a href="https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md"&gt;LLaMA.cpp&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The most interesting part is the small LLM coach/agent that gives feedback and tips regarding my performance. It's an &lt;em&gt;untested&lt;/em&gt; and &lt;em&gt;non-evaluated&lt;/em&gt; multi-step LLM pipeline that extracts running-related insights from around the web, and uses them to give me personalized feedback using my &lt;a href="https://www.strava.com"&gt;Strava&lt;/a&gt; data.&lt;/p&gt;
&lt;p&gt;To do this, I extract transcripts from marathon-related YouTube videos (which I probably &lt;a href="https://www.youtube.com/watch?v=xiJMjTnlxg4"&gt;shouldn't&lt;/a&gt;), and use those to generate a list of insights related to running and training. Once I have those insights, I use an LLM to translate them into pandas code. Once that's done, I can feed an LLM the insight, the code, and the outcome of running that code on my own data. Sounds confusing? I know it does - but it works surprisingly well!&lt;/p&gt;
&lt;p&gt;This pipeline could probably be significantly improved by designing some evaluations and metrics, and iterating on it. But for now, I'll use it to prepare for &lt;a href="https://www.marathon06.com/2024/AN/"&gt;Nice&lt;/a&gt;, and make any adjustments I see fit.&lt;/p&gt;
&lt;p&gt;The visualizations and graphs are built using &lt;a href="https://altair-viz.github.io/"&gt;Altair&lt;/a&gt;. They are a mix of my favorite graphs from &lt;a href="https://crplot.com/"&gt;Christoph’s CR Plots&lt;/a&gt;, Strava, and my own brain. The name (and most of the visualizations) come from &lt;a href="https://scholar.google.co.uk/citations?view_op=view_citation&amp;amp;hl=en&amp;amp;user=C__krSUAAAAJ&amp;amp;cstart=20&amp;amp;pagesize=80&amp;amp;citation_for_view=C__krSUAAAAJ:j3f4tGmQtD8C"&gt;Giovanni Tanda&lt;/a&gt;'s work - which I encourage you to read through!&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Do things that don't scale</title><link href="https://duarteocarmo.com/blog/do-things-that-dont-scale.html" rel="alternate"/><published>2024-06-06T10:00:00+02:00</published><updated>2024-06-06T10:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2024-06-06:/blog/do-things-that-dont-scale.html</id><summary type="html">&lt;p&gt;I'm lucky enough to work with some pretty talented folks. During a recent offsite, one of them gave a completely improvised talk. They had just gotten back home from holidays - so naturally, they didn't really have time to prepare anything. But what they said resonated.&lt;/p&gt;
&lt;p&gt;It was about &lt;em&gt;improvisation&lt;/em&gt;. How …&lt;/p&gt;</summary><content type="html">&lt;p&gt;I'm lucky enough to work with some pretty talented folks. During a recent offsite, one of them gave a completely improvised talk. They had just gotten back home from holidays - so naturally, they didn't really have time to prepare anything. But what they said resonated.&lt;/p&gt;
&lt;p&gt;It was about &lt;em&gt;improvisation&lt;/em&gt;. How a lot of things we do every day are - to some extent, made up! From how we greet each other, to how we respond to an unexpected phone call. None of it is planned. We don't create some sort of crazy scalable distributed architecture to deal with these things: They're &lt;em&gt;&lt;a href="https://en.wikipedia.org/wiki/Lick_(music)"&gt;licks&lt;/a&gt;&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;For some, this felt counter-intuitive. For me, it really hit home.&lt;/p&gt;
&lt;p&gt;But I understand how it might feel counter-intuitive. Experienced engineers think &lt;em&gt;hard&lt;/em&gt; before solving problems - they're expected to. "Hey - we've seen this before..." They'll research. They'll anticipate problems. Even when the problem is not well-defined, they'll work hard on defining it! It doesn't even cross our minds that a badly defined problem is probably not a problem to begin with. With all this thinking, they enter &lt;em&gt;the ether&lt;/em&gt;. The ether is that stale anticipation, that hesitation. It takes many forms - but in the engineering world it normally shows up as endless research, stalling, and no decisions being made. The ether is this "pre-problem" hesitation.&lt;/p&gt;
&lt;p&gt;The talk also argued that perfect is &lt;a href="https://www.youtube.com/watch?v=CZ8fTfpyqpQ&amp;amp;t=211s"&gt;the ENEMY&lt;/a&gt; of good enough. While we're thinking of the perfect solution, the original problem is &lt;em&gt;still&lt;/em&gt; there. Users don't see your research, users don't see the architecture meetings, users don't see &lt;em&gt;the ether&lt;/em&gt;. The only thing they see is the problem. And guess what? The problem is still there.&lt;/p&gt;
&lt;p&gt;And yeah, good enough is likely pretty bad. The first solution will probably suck. It was fast, we were in a rush. We just wanted to put something out there to solve a problem! And obviously, we didn't think of all the ways it would/could break.&lt;/p&gt;
&lt;p&gt;It's a privilege to build something that doesn't scale. We might argue that if you get stuck in the ether in the first place, you won't even get to scale being a problem. Scale is a great problem to have. It means your solution &lt;em&gt;started&lt;/em&gt; solving a problem - and people want more of it. This is a good thing.&lt;/p&gt;
&lt;p&gt;And yes, now we have &lt;em&gt;double the trouble&lt;/em&gt;. Now we need to scale AND handle users at the same time. But what's the alternative? Getting stuck in the ether and potentially building something we don't even know if people will use?&lt;/p&gt;
&lt;p&gt;I like to think that doing things that don't scale will likely save time in the long run. Lots of things we build don't get used. And so if you spend a lot of time in the ether - you risk that time being completely wasted. And there are no guarantees you'll be able to anticipate all the ways you won't scale. But getting something out there, in front of people, takes you from 0 to 1. And the faster we realize what we've built doesn't solve anyone's problem, the better.&lt;/p&gt;
&lt;p&gt;And solving problems is what we're here for.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>How I self-host in 2024</title><link href="https://duarteocarmo.com/blog/how-i-self-host-in-2024.html" rel="alternate"/><published>2024-04-28T19:00:00+02:00</published><updated>2024-04-28T19:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2024-04-28:/blog/how-i-self-host-in-2024.html</id><summary type="html">&lt;p&gt;I'm a sucker for side projects. There's nothing quite like building something to learn about it.&lt;/p&gt;
&lt;p&gt;Over the last 5+ years I've accumulated a little over 15 small web apps and websites. Almost all of them are hosted on a small Hetzner server and deployed using &lt;a href="https://duarteocarmo.com/blog/down-from-the-cloud-self-hosting.html"&gt;a mix of ssh …&lt;/a&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;I'm a sucker for side projects. There's nothing quite like building something to learn about it.&lt;/p&gt;
&lt;p&gt;Over the last 5+ years I've accumulated a little over 15 small web apps and websites. Almost all of them are hosted on a small Hetzner server and deployed using &lt;a href="https://duarteocarmo.com/blog/down-from-the-cloud-self-hosting.html"&gt;a mix of ssh, docker-compose, and Caddy&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Some months ago my small server started giving me some problems. The 4 vCPUs and 16 GB RAM weren't enough, and hiccups became more frequent. I also want to run more and more small LLMs on it, which turned out to be &lt;em&gt;&lt;a href="https://github.com/duarteocarmo/lusiaidas/blob/master/app.py#L1"&gt;challenging&lt;/a&gt;&lt;/em&gt;. The whole &lt;a href="https://github.com/duarteocarmo/governosombra/blob/master/.github/workflows/workflow.yml"&gt;"ssh with root using github actions"&lt;/a&gt; started to look like more of a limitation than a feature.&lt;/p&gt;
&lt;p&gt;It was time for an upgrade.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/66/coolify-apps.png" alt="Coolify apps"
style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;First, I needed to upgrade my hardware. Not sure if &lt;a href="https://www.hetzner.com/"&gt;Hetzner&lt;/a&gt; is the cheapest out there, but after using them for 5+ years I can definitely say they're reliable. I went with their &lt;a href="https://www.hetzner.com/sb"&gt;server auction&lt;/a&gt;, which is a great (and sustainable) way of getting a nice refurbished server for a fair price. For about 30 EUR/month, I upgraded to a nice Intel i7 with 64GB RAM(!). Should be more than enough.&lt;/p&gt;
&lt;p&gt;With better hardware, I needed to decide whether I would keep my &lt;a href="https://duarteocarmo.com/blog/down-from-the-cloud-self-hosting.html"&gt;old setup&lt;/a&gt;, or upgrade to something a bit more &lt;em&gt;streamlined&lt;/em&gt; (god I &lt;em&gt;hate&lt;/em&gt; that word). Turns out, there are quite a few open source PaaS alternatives. From &lt;a href="https://dokku.com/"&gt;Dokku&lt;/a&gt;, &lt;a href="https://caprover.com/"&gt;CapRover&lt;/a&gt;, to DHH's &lt;a href="https://kamal-deploy.org/"&gt;Kamal&lt;/a&gt;. The one that caught my eye the most was &lt;a href="https://coolify.io/"&gt;Coolify&lt;/a&gt;. Without much due diligence, it seemed to fit most of what I needed: &lt;a href="https://github.com/coollabsio/coolify"&gt;Open source&lt;/a&gt;, &lt;a href="https://coolify.io/docs/installation"&gt;easy to install&lt;/a&gt;, &lt;a href="https://github.com/coollabsio/coolify/commits/main/"&gt;active project&lt;/a&gt;, &lt;a href="https://coolify.io/docs/screenshots"&gt;with a polished interface&lt;/a&gt;, &lt;a href="https://coolify.io/docs/knowledge-base/git/github/integration"&gt;integrates&lt;/a&gt; with GitHub, and supports docker-based deployments. That's not an easy list to check off.&lt;/p&gt;
&lt;p&gt;The migration was pretty smooth, except for &lt;a href="https://aicoverlettercreator.com/"&gt;a&lt;/a&gt; &lt;a href="https://infrequent.app/"&gt;couple&lt;/a&gt; of more complicated Django applications with databases on docker volumes. Since everything was already running on containers, it was pretty much plug-and-play. In a couple of days everything was migrated.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/66/kuma-screenshot.png" alt="Kuma screenshot"
style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;There are a lot of things I love about Coolify. It scans my GitHub repos and redeploys when needed via webhooks. It also sends me Telegram notifications when something gets updated to a new version. Finally, it also provides a wide range of other applications you can deploy (Databases, S3-compatible storages, and a &lt;a href="https://coolify.io/docs/resources/services/index"&gt;wide range&lt;/a&gt; of other services).&lt;/p&gt;
&lt;p&gt;Don't get me wrong - I love the Cloud. But essentially, the Cloud is just someone else’s computer. So why shouldn’t it be my computer? Self-hosting everything means I can pay a fixed price for my hardware, and &lt;em&gt;not&lt;/em&gt; for the number of applications that run on it. And even though I did like my little &lt;em&gt;hacked-together&lt;/em&gt; setup, I love being able to use, AND &lt;a href="https://github.com/coollabsio/coolify/pull/2028"&gt;contribute&lt;/a&gt; to a promising open source project!&lt;/p&gt;
&lt;p&gt;There's something incredible about what a &lt;a href="https://github.com/andrasbacsai"&gt;1-man team&lt;/a&gt; can accomplish with the support of the open source community.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>The best keyboard I've ever had</title><link href="https://duarteocarmo.com/blog/happy-hacking-keyboard-review.html" rel="alternate"/><published>2024-04-22T21:30:00+02:00</published><updated>2024-04-22T21:30:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2024-04-22:/blog/happy-hacking-keyboard-review.html</id><summary type="html">&lt;p&gt;&lt;/p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/65/top.png" alt="Top view hhkb" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;There I was. Ready to click the big blue purchase button. &lt;a href="https://www.amazon.de/-/en/Keyboard-PD-KB800WS-Professional-Mechanical-Bluetooth/dp/B082TQK2SB/ref=sr_1_5?crid=252LFZZHYLWDD&amp;amp;dib=eyJ2IjoiMSJ9.gyKtvGmLTXpI3swZKvqN1nA1_IpxN0tDr08V-ZBKHBmG1lD2gKTlEwNwkO4E9kuJXwi1xyyVHfl40hOiT_yN-pl5u__XxzMT5NhETtf5gDxP7fNbDTKpQfLR6yWiT_xAOII9MF-sFnUNT0skIHa6mX3_S9bqgac_NuKq2Os7PgVtEDF0dZ-JfTAtB281d0B9SfhQ_IbJ7wE8rjYmRHT8_Txvk7KcJKX2GvjIqiQnCrM.QEr4ATZDMt12kc9i0f5UwxTYd3RtKaMZSs8CND32-qs&amp;amp;dib_tag=se&amp;amp;keywords=happy+hacking+keyboard+hybrid+s&amp;amp;qid=1713811000&amp;amp;sprefix=happy+hacking%2Caps%2C90&amp;amp;sr=8-5&amp;amp;ufe=app_do%3Aamzn1.fos.335e368b-29e8-4542-bb58-939a88195e78"&gt;350 Euro&lt;/a&gt;, for a &lt;em&gt;keyboard&lt;/em&gt; (!) Asking Vitto multiple times: is this &lt;em&gt;really&lt;/em&gt; worth it? Should I go for it? I mean, that's a whole lot of dinners.&lt;/p&gt;
&lt;p&gt;After much hesitation, I did it. &lt;em&gt;All black&lt;/em&gt;, with blank key caps …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;/p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/65/top.png" alt="Top view hhkb" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;There I was. Ready to click the big blue purchase button. &lt;a href="https://www.amazon.de/-/en/Keyboard-PD-KB800WS-Professional-Mechanical-Bluetooth/dp/B082TQK2SB/ref=sr_1_5?crid=252LFZZHYLWDD&amp;amp;dib=eyJ2IjoiMSJ9.gyKtvGmLTXpI3swZKvqN1nA1_IpxN0tDr08V-ZBKHBmG1lD2gKTlEwNwkO4E9kuJXwi1xyyVHfl40hOiT_yN-pl5u__XxzMT5NhETtf5gDxP7fNbDTKpQfLR6yWiT_xAOII9MF-sFnUNT0skIHa6mX3_S9bqgac_NuKq2Os7PgVtEDF0dZ-JfTAtB281d0B9SfhQ_IbJ7wE8rjYmRHT8_Txvk7KcJKX2GvjIqiQnCrM.QEr4ATZDMt12kc9i0f5UwxTYd3RtKaMZSs8CND32-qs&amp;amp;dib_tag=se&amp;amp;keywords=happy+hacking+keyboard+hybrid+s&amp;amp;qid=1713811000&amp;amp;sprefix=happy+hacking%2Caps%2C90&amp;amp;sr=8-5&amp;amp;ufe=app_do%3Aamzn1.fos.335e368b-29e8-4542-bb58-939a88195e78"&gt;350 Euro&lt;/a&gt;, for a &lt;em&gt;keyboard&lt;/em&gt; (!) Asking Vitto multiple times: is this &lt;em&gt;really&lt;/em&gt; worth it? Should I go for it? I mean, that's a whole lot of dinners.&lt;/p&gt;
&lt;p&gt;After much hesitation, I did it. &lt;em&gt;All black&lt;/em&gt;, with blank key caps. Yes, I too was skeptical from the start. The first days, I couldn't type 4 keys in a row without having to pull up the manual. 2 days passed, then 3 days passed... Still, I didn't get it. How do people use this keyboard? How am I supposed to type a full credit card number with blank key caps? You're a pro touch typist? Good for you. I'm not.&lt;/p&gt;
&lt;p&gt;"I can get used to this" I would tell myself. But after a week using the keyboard, I've had enough. Basta. This isn't going to cut it for me. I'm just too slow with it. 350 Euro.. I'm not only poorer but now I'm also &lt;em&gt;slower&lt;/em&gt; at typing.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/65/return.png" alt="Returning the HHKB keyboard" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;I decided to return the thing&lt;sup id="sf-happy-hacking-keyboard-review-1-back"&gt;&lt;a href="#sf-happy-hacking-keyboard-review-1" class="simple-footnote" title="Don't buy from PFU EMEA, order from Amazon directly."&gt;1&lt;/a&gt;&lt;/sup&gt;. And went back to my good ol' &lt;a href="https://www.keychron.com/collections/normal-profile-keyboards/products/keychron-k2-hot-swappable-wireless-mechanical-keyboard"&gt;Keychron K2&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="a-second-chance"&gt;A second chance&lt;/h2&gt;
&lt;p&gt;Some days passed and something was just not right. This can't be! I can't be missing that piece of junk. There's just something about those keys.. Typing on that keyboard just felt &lt;em&gt;right&lt;/em&gt;. Even if I didn't know where each key was.. It just feels good!&lt;/p&gt;
&lt;p&gt;I reordered it. And we were not about to make the same 'blank key caps' mistake. I ordered the white version with printed key caps. Just to make sure I CAN SEE EVERYTHING.&lt;/p&gt;
&lt;h2 id="some-cons"&gt;Some cons&lt;/h2&gt;
&lt;p&gt;It's perfect, but there are still some things I'm not in love with. For 350 EUR I expect everything to be absolutely perfect. I mean, &lt;em&gt;really&lt;/em&gt; perfect. The materials are good, but for the most part, they are still plastic. Every time I want to start using the keyboard, I need to click and hold the little power button in the back.&lt;/p&gt;
&lt;p&gt;Another small complaint is the layout. This keyboard changes the place of the Control key, removes the delete key, has no dedicated arrow keys. For the most part, it works great (I use Vim keybindings for most things). But now my brain goes through a small hiccup every time I use a 'normal' keyboard. And my pinky still hurts. Maybe my hands are just big, I don't know.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/65/setup.png" alt="Setup view happy hacking keyboard" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;p&gt;&lt;/p&gt;
&lt;h2 id="what-made-me-fall-in-love"&gt;What made me fall in love&lt;/h2&gt;
&lt;p&gt;It was the typing experience. It’s really like typing on a cloud. I'm not a big mechanical keyboard head. I've had a couple of &lt;a href="https://www.keychron.com/"&gt;Keychron&lt;/a&gt; keyboards, that's it. But these &lt;a href="https://deskthority.net/wiki/Topre_switch"&gt;Topre switches&lt;/a&gt; are just something else. They are smooth and squishy at the same time. It feels as though your fingers are just swimming in the keyboard. &lt;em&gt;That's&lt;/em&gt; what got me hooked from the start.&lt;/p&gt;
&lt;p&gt;There are two other things I love about the &lt;a href="https://www.amazon.de/-/en/Keyboard-PD-KB800WS-Professional-Mechanical-Bluetooth/dp/B082TQK2SB/ref=sr_1_5?crid=252LFZZHYLWDD&amp;amp;dib=eyJ2IjoiMSJ9.gyKtvGmLTXpI3swZKvqN1nA1_IpxN0tDr08V-ZBKHBmG1lD2gKTlEwNwkO4E9kuJXwi1xyyVHfl40hOiT_yN-pl5u__XxzMT5NhETtf5gDxP7fNbDTKpQfLR6yWiT_xAOII9MF-sFnUNT0skIHa6mX3_S9bqgac_NuKq2Os7PgVtEDF0dZ-JfTAtB281d0B9SfhQ_IbJ7wE8rjYmRHT8_Txvk7KcJKX2GvjIqiQnCrM.QEr4ATZDMt12kc9i0f5UwxTYd3RtKaMZSs8CND32-qs&amp;amp;dib_tag=se&amp;amp;keywords=happy+hacking+keyboard+hybrid+s&amp;amp;qid=1713811000&amp;amp;sprefix=happy+hacking%2Caps%2C90&amp;amp;sr=8-5&amp;amp;ufe=app_do%3Aamzn1.fos.335e368b-29e8-4542-bb58-939a88195e78"&gt;Happy Hacking Keyboard&lt;/a&gt;. First, is the battery. I've had this keyboard for over a month, and did not think about battery a single time. I’ve come to realize just how much I hate charging my keyboard every week. I prefer to use a couple of AA batteries every couple of months.&lt;/p&gt;
&lt;p&gt;The second thing is the sound. I like mechanical - but I don't enjoy everyone around the house hearing the hammering sound of plastic against plastic, or Vitto having to know every time I write to file in Neovim. I enjoy working around people sometimes. With the HHKB I can do so without bothering everyone around me.&lt;/p&gt;
&lt;h2 id="final-thoughts"&gt;Final thoughts&lt;/h2&gt;
&lt;p&gt;So yeah. &lt;a href="https://gizmodo.com/this-is-the-perfect-keyboard-1845258727"&gt;I have to agree with Alex&lt;/a&gt;. This might very well be the perfect keyboard. I'm not 100% sold on all the choices that were made, but I can see &lt;em&gt;why&lt;/em&gt; they were made. And the typing experience got me hooked from the first day (or at least once I knew where all the keys were).&lt;/p&gt;
&lt;p&gt;There is only one last thing that will truly tell if this is the perfect keyboard: Time. And even though I have a lot of trust in &lt;a href="https://en.wikipedia.org/wiki/Happy_Hacking_Keyboard"&gt;Japanese manufacturing&lt;/a&gt;, we'll only know the result of that test in a decade.&lt;/p&gt;
&lt;hr&gt;&lt;ol class="simple-footnotes"&gt;&lt;li id="sf-happy-hacking-keyboard-review-1"&gt;Don't buy from &lt;a href="https://www.pfuemea.com/en-gb"&gt;PFU EMEA&lt;/a&gt;, order from Amazon directly. &lt;a href="#sf-happy-hacking-keyboard-review-1-back" class="simple-footnote-back"&gt;↩&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;</content><category term="blog"/></entry><entry><title>An experiment with Gemma 2B and a Portuguese classic</title><link href="https://lusiaidas.duarteocarmo.com/" rel="alternate"/><published>2024-04-10T20:00:00+02:00</published><updated>2024-04-10T20:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:lusiaidas.duarteocarmo.com,2024-04-10:/</id><content type="html"/><category term="blog"/></entry><entry><title>NewsHavn: Danish news, in English</title><link href="https://duarteocarmo.com/blog/newshavn-danish-news-in-english.html" rel="alternate"/><published>2024-02-26T20:30:00+01:00</published><updated>2024-02-26T20:30:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2024-02-26:/blog/newshavn-danish-news-in-english.html</id><summary type="html">&lt;p&gt;&lt;em&gt;Conversas de café&lt;/em&gt;. Literally translated, means "coffee shop conversations".&lt;/p&gt;
&lt;p&gt;An upcoming election, the weather next week, a corruption scandal, a new policy. Just some examples of &lt;em&gt;Conversas de café&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;As an expat living in Denmark for the past 7 years (without speaking the language), that's perhaps one of the things …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;em&gt;Conversas de café&lt;/em&gt;. Literally translated, means "coffee shop conversations".&lt;/p&gt;
&lt;p&gt;An upcoming election, the weather next week, a corruption scandal, a new policy. Just some examples of &lt;em&gt;Conversas de café&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;As an expat living in Denmark for the past 7 years (without speaking the language), that's perhaps one of the things I miss most. Maybe it's because of where I'm from, but for me, water cooler conversations &lt;em&gt;are&lt;/em&gt; Culture.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Learn the language&lt;/em&gt;, &lt;em&gt;use Google Translate&lt;/em&gt;, &lt;em&gt;read &lt;a href="https://www.thelocal.dk/"&gt;some&lt;/a&gt; expat focused website&lt;/em&gt;. No thanks. I want to read the same stuff people around me read. I want to understand what worries &lt;em&gt;them&lt;/em&gt;, not other expats.&lt;/p&gt;
&lt;p&gt;Enter &lt;a href="https://newshavn.duarteocarmo.com/"&gt;NewsHavn&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;a href="https://newshavn.duarteocarmo.com" target="_blank"&gt;
&lt;img src="https://duarteocarmo.com/images/63/website.png" alt="Newshavn.duarteocarmo.com"
style="max-width:100%;border-radius: 2px"&gt;&lt;/p&gt;
&lt;figcaption&gt;&lt;/figcaption&gt;
&lt;p&gt;&lt;/a&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;NewsHavn parses the RSS feeds of a couple of big Danish newspapers and uses &lt;a href="https://github.com/duarteocarmo/NewsHavn/blob/8758d1f8214e2c54c24c06d0c0ba42b92e78c474/parser/parse.go#L173"&gt;Mistral 7B&lt;/a&gt; to translate them. The design is inspired by &lt;a href="https://text.npr.org/"&gt;NPR&lt;/a&gt;. The code is all &lt;a href="https://github.com/duarteocarmo/NewsHavn"&gt;open source&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For this one, I decided to use Go (again). I must say, Go is growing on me. Rust is fun and efficient, but I feel like Go strikes the right balance between strictness and flexibility. Another thing I love about Go is its concurrency model. &lt;a href="https://gobyexample.com/goroutines"&gt;Goroutines&lt;/a&gt; are just beautiful, and make me want to write more concurrent code. It feels like it's a no-bullshit language, and I'm a fan of no-bullshit.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>mistral-doc: Fine-tuning an LLM on my ChatGPT conversations</title><link href="https://duarteocarmo.com/blog/mistral-doc-fine-tuning-an-llm-on-my-chatgpt-conversations.html" rel="alternate"/><published>2024-02-09T13:00:00+01:00</published><updated>2024-02-09T13:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2024-02-09:/blog/mistral-doc-fine-tuning-an-llm-on-my-chatgpt-conversations.html</id><summary type="html">&lt;p&gt;&lt;i&gt;&lt;a href="https://github.com/duarteocarmo/mistral-doc" target="_blank"&gt;Step-by-step instructions to do it yourself&lt;/a&gt;&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;If you ever read this blog, you probably already know by now. For the past couple of months, I've been trying to create my &lt;em&gt;very own&lt;/em&gt; Large Language Model.&lt;/p&gt;
&lt;p&gt;OpenAI's models have had a significant impact on my productivity. But something still bothers me …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;i&gt;&lt;a href="https://github.com/duarteocarmo/mistral-doc" target="_blank"&gt;Step-by-step instructions to do it yourself&lt;/a&gt;&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;If you ever read this blog, you probably already know by now. For the past couple of months, I've been trying to create my &lt;em&gt;very own&lt;/em&gt; Large Language Model.&lt;/p&gt;
&lt;p&gt;OpenAI's models have had a significant impact on my productivity. But something still bothers me. My data, my preferences, my history, my concerns, all going into the hands of a single company.&lt;/p&gt;
&lt;p&gt;But &lt;em&gt;this&lt;/em&gt; time, &lt;em&gt;this&lt;/em&gt; time I think I got really close! The result is &lt;code&gt;mistral-doc&lt;/code&gt;, a Mistral fine-tune on all my ChatGPT conversations.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/62/ollama.png" alt="Ollama SwiftUI with mistral-doc running"
style="max-width:100%;border-radius: 2px"&gt;&lt;/p&gt;
&lt;figcaption&gt;&lt;/figcaption&gt;
&lt;p&gt;&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;The goal is to emulate my current ChatGPT Plus experience. My favorite thing about GPT4 is how obedient it is to &lt;a href="https://github.com/duarteocarmo/mistral-doc/blob/846bff73cdf065620d529b2024e7cfba774dc0a5/Modelfile#L24"&gt;my system prompt&lt;/a&gt;. &lt;em&gt;Especially&lt;/em&gt; when I just want a short and concise answer. &lt;a href="https://chat.openai.com/share/84a98280-ac66-41b4-9158-cbccc149cbdd"&gt;When I ask who was the first king of Portugal&lt;/a&gt;, I want the answer to be short (and sure, correct is also good).&lt;/p&gt;
&lt;p&gt;Over the past year, I've accumulated a little over 1000(!) conversations with ChatGPT. Thankfully, OpenAI actually makes exporting all my data pretty straightforward. So, I built a small &lt;a href="https://github.com/duarteocarmo/mistral-doc/blob/main/process_gpt_export.py"&gt;script&lt;/a&gt; that could process the export, and store all of my conversations as a Hugging Face dataset. It's private, of course, but you can use the same script to build yours.&lt;/p&gt;
&lt;p&gt;There are thousands of models I could potentially fine-tune on my conversations. And the model choice will, &lt;em&gt;for sure&lt;/em&gt;, have an impact on the quality of the results. But I decided to go with something small, so that I could train everything in ~2 hours with &lt;a href="https://github.com/OpenAccess-AI-Collective/axolotl"&gt;Axolotl&lt;/a&gt;. I picked &lt;code&gt;Mistral-7B-Instruct-v0.2&lt;/code&gt;. Yes, a base model would likely learn a bit more. But I don't want to teach a model to chat, I want to teach a model to learn from &lt;em&gt;my&lt;/em&gt; chats.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/62/merged.png" alt="Mistral-doc instruct merged on hugging face"
style="max-width:100%;border-radius: 2px"&gt;&lt;/p&gt;
&lt;figcaption&gt;&lt;/figcaption&gt;
&lt;p&gt;&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;With Axolotl and a 40GB/RAM GPU machine on RunPod, the fine-tuning process took around 2.5 hours. Here's the &lt;a href="https://github.com/duarteocarmo/mistral-doc/blob/main/configs/mistral-doc-instruct.yml"&gt;config&lt;/a&gt; I used, by the way. Once the training was done, I had to &lt;em&gt;merge the model back to base&lt;/em&gt; (we trained a &lt;a href="https://magazine.sebastianraschka.com/p/practical-tips-for-finetuning-llms"&gt;LoRA adapter&lt;/a&gt;, but want the whole thing). Thankfully, Axolotl also &lt;a href="https://github.com/OpenAccess-AI-Collective/axolotl?tab=readme-ov-file#merge-lora-to-base"&gt;handles this&lt;/a&gt;. At the end of the process, I successfully had my entire ~15 GB model in a repository I called &lt;code&gt;mistral-doc-instruct-v4-merged&lt;/code&gt;. Yes, there were 3 other versions where I screwed things up.&lt;/p&gt;
&lt;p&gt;With the entire fine-tuned model ready, it was now time to run it on my machine. Before I did that, I had to ensure the model was going to be of a reasonable size and speed. I used &lt;a href="https://github.com/ggerganov/llama.cpp"&gt;llama.cpp&lt;/a&gt; to first convert the model to the &lt;code&gt;GGUF&lt;/code&gt; format, and then to &lt;a href="https://sebastianraschka.com/blog/2023/llm-mixed-precision-copy.html#quantization"&gt;quantize&lt;/a&gt; it to 4-bits. This process resulted in a &lt;em&gt;single&lt;/em&gt; file, called &lt;code&gt;mistral-doc.gguf&lt;/code&gt;, which is ~4.5 GB in size.&lt;/p&gt;
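&lt;p&gt;As a rough sanity check on those sizes, here is a back-of-the-envelope sketch. The parameter count (about 7.24 billion for a Mistral-7B-class model) and the roughly 4.8 effective bits per weight for a 4-bit quantization are assumptions for illustration, not numbers measured from the actual files:&lt;/p&gt;

```python
# Back-of-the-envelope model size estimate (assumed numbers, not measured).
PARAMS = 7.24e9  # assumed parameter count for a Mistral-7B-class model


def size_gb(params, bits_per_weight):
    """Approximate on-disk size in GB (10^9 bytes) at a given precision."""
    return params * bits_per_weight / 8 / 1e9


fp16_gb = size_gb(PARAMS, 16)   # merged model stored as 16-bit floats
q4_gb = size_gb(PARAMS, 4.8)    # assumed effective bits for a 4-bit quant

print(f"fp16:  ~{fp16_gb:.1f} GB")
print(f"4-bit: ~{q4_gb:.1f} GB")
```

&lt;p&gt;Both estimates land in the same ballpark as the ~15 GB merged model and the ~4.5 GB quantized file. The 4-bit file is a little bigger than a naive "4 bits per weight" would suggest because quantized formats also store scaling metadata alongside the weights.&lt;/p&gt;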
&lt;p&gt;Now we can take it for a spin! Using &lt;a href="https://ollama.ai/"&gt;Ollama&lt;/a&gt;, I can create a &lt;a href="https://github.com/duarteocarmo/mistral-doc/blob/main/Modelfile"&gt;&lt;code&gt;Modelfile&lt;/code&gt;&lt;/a&gt; with all the details and configuration I need. With that file, I can now create my own model in the Ollama format with:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# create the model with ollama&lt;/span&gt;

$&lt;span class="w"&gt; &lt;/span&gt;ollama&lt;span class="w"&gt; &lt;/span&gt;create&lt;span class="w"&gt; &lt;/span&gt;mistral-doc&lt;span class="w"&gt; &lt;/span&gt;-f&lt;span class="w"&gt; &lt;/span&gt;./Modelfile

&lt;span class="c1"&gt;# chat with your fine tune&lt;/span&gt;

$&lt;span class="w"&gt; &lt;/span&gt;ollama&lt;span class="w"&gt; &lt;/span&gt;run&lt;span class="w"&gt; &lt;/span&gt;mistral-doc
&amp;gt;&amp;gt;&amp;gt;&lt;span class="w"&gt; &lt;/span&gt;vv&lt;span class="w"&gt; &lt;/span&gt;jasmin&lt;span class="w"&gt; &lt;/span&gt;or&lt;span class="w"&gt; &lt;/span&gt;basmati&lt;span class="w"&gt; &lt;/span&gt;rice&lt;span class="w"&gt; &lt;/span&gt;with&lt;span class="w"&gt; &lt;/span&gt;falafel?
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Running the model in the terminal is all fine and dandy, but I want something that can actually replace the instinct of going into a browser and typing &lt;code&gt;chat.openai.com&lt;/code&gt;. I downloaded a small app called &lt;a href="https://github.com/kghandour/Ollama-SwiftUI"&gt;&lt;code&gt;Ollama-SwiftUI&lt;/code&gt;&lt;/a&gt; that lets me run my model in a nice little Mac app. Currently, I'm jumping around between using that, and &lt;a href="https://lmstudio.ai/"&gt;LM Studio&lt;/a&gt; to interact with &lt;code&gt;mistral-doc&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/62/lmstudio.png" alt="Mistral doc in action in LMStudio"
style="max-width:100%;border-radius: 2px"&gt;&lt;/p&gt;
&lt;figcaption&gt;&lt;/figcaption&gt;
&lt;p&gt;&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;How has the experience been so far? Well for starters, it's much better than any off-the-shelf ~7 billion parameter model I've tried before. For simple and straightforward tasks, it's a great alternative to ChatGPT Plus. I should note that it's not better because it's smarter or does math better. This model is better because it has learned how &lt;em&gt;I&lt;/em&gt; like questions to be answered. This of course does &lt;em&gt;not&lt;/em&gt; mean you will like it, which is why I'm not going to publish it. We could make it even better! I could fine-tune something like &lt;a href="https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1"&gt;Mixtral&lt;/a&gt;, which would surely make it even more powerful! But I would probably not be able to run it on my laptop, and would have to pay someone else to host it. I could also generate even more data, increase the number of epochs, teach it more things about me.&lt;/p&gt;
&lt;p&gt;Not sure I will; but this process definitely looks promising.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>What's in my NOW?</title><link href="https://kk.org/cooltools/whats-in-my-now-duarte-o-carmo/" rel="alternate"/><published>2024-02-01T10:00:00+01:00</published><updated>2024-02-01T10:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:kk.org,2024-02-01:/cooltools/whats-in-my-now-duarte-o-carmo/</id><content type="html"/><category term="blog"/></entry><entry><title>Self-hosting my personal LLM (but not quite)</title><link href="https://duarteocarmo.com/blog/self-hosting-llm-ambrosio.html" rel="alternate"/><published>2024-01-21T22:35:00+01:00</published><updated>2024-01-21T22:35:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2024-01-21:/blog/self-hosting-llm-ambrosio.html</id><summary type="html">&lt;p&gt;ChatGPT is now part of our daily lives. A quick question, an extra input, some quick feedback, I always reach for it. AGI or not AGI, I certainly can't deny the impact it has had on our lives. It's pretty incredible!&lt;/p&gt;
&lt;p&gt;But that small voice just won't go away. &lt;em&gt;"Where …&lt;/em&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;ChatGPT is now part of our daily lives. A quick question, an extra input, some quick feedback, I always reach for it. AGI or not AGI, I certainly can't deny the impact it has had on our lives. It's pretty incredible!&lt;/p&gt;
&lt;p&gt;But that small voice just won't go away. &lt;em&gt;"Where is all this data going? Should I really be telling this bot this much? What if someone else sees this?"&lt;/em&gt;
In a way, it feels like everything I tell this bot is going into a void I have no control of. I can't be the only one feeling it.&lt;/p&gt;
&lt;p&gt;And I don't like it. It's a bit like putting all my eggs in a single basket. As engineers, we know how bad having a single point of failure is. Resilience! That's what we were taught.&lt;/p&gt;
&lt;p&gt;Given that I had already started building &lt;a href="https://github.com/duarteocarmo/ambrosio"&gt;Ambrosio&lt;/a&gt;, why not use it to self-host my own LLM?&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;a href="https://duarteocarmo.com/images/60/ambrosio.png" target="_blank"&gt;
&lt;img src="https://duarteocarmo.com/images/60/ambrosio.png" alt="Ambrosio chat and generate photo" style="max-width:95%;border-radius: 2px"&gt;&lt;/p&gt;
&lt;figcaption&gt;&lt;/figcaption&gt;
&lt;p&gt;&lt;/a&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;h2 id="self-hosting-with-the-help-of-llamacpp"&gt;Self-hosting with the help of Llama.cpp&lt;/h2&gt;
&lt;p&gt;I'm not really interested in spending 500 USD/month to run a model that requires 40 GB of VRAM on a large GPU. So I had to lower my standards a little bit. The premise was now: What models can I run on a 15 USD/month server?&lt;/p&gt;
&lt;p&gt;I started looking at the &lt;a href="https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard"&gt;only LLM leaderboard I trust&lt;/a&gt;. There aren't many options with 7 billion parameters. The best one seems to be &lt;a href="https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha"&gt;Starling-LM-7B-alpha&lt;/a&gt;, out of Berkeley. &lt;a href="https://huggingface.co/TheBloke/Starling-LM-7B-alpha-GGUF#provided-files"&gt;Quantized&lt;/a&gt;, it should just about run on a cheap Hetzner box. Getting it up and running on a small box with &lt;a href="https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md"&gt;llama.cpp&lt;/a&gt; is as easy as:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# download model&lt;/span&gt;

$&lt;span class="w"&gt; &lt;/span&gt;wget&lt;span class="w"&gt; &lt;/span&gt;https://huggingface.co/TheBloke/Starling-LM-7B-alpha-GGUF/resolve/main/starling-lm-7b-alpha.Q5_K_M.gguf?download&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;

&lt;span class="c1"&gt;# setup llama cpp API with ./server&lt;/span&gt;

$&lt;span class="w"&gt; &lt;/span&gt;git&lt;span class="w"&gt; &lt;/span&gt;clone&lt;span class="w"&gt; &lt;/span&gt;https://github.com/ggerganov/llama.cpp
$&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;llama.cpp
$&lt;span class="w"&gt; &lt;/span&gt;make
$&lt;span class="w"&gt; &lt;/span&gt;./server&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;starling-lm-7b-alpha.Q5_K_M.gguf&lt;span class="w"&gt; &lt;/span&gt;-c&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;8192&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--temp&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;.0&lt;span class="w"&gt; &lt;/span&gt;--repeat_penalty&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;.1&lt;span class="w"&gt; &lt;/span&gt;-n&lt;span class="w"&gt; &lt;/span&gt;-1&lt;span class="w"&gt; &lt;/span&gt;-p&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;GPT4 User: {prompt}&amp;lt;|end_of_turn|&amp;gt;GPT4 Assistant:&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;And just like that, you're running your own OpenAI compatible (&lt;em&gt;ugh&lt;/em&gt;, I hate that term) API serving that very 7B model.&lt;/p&gt;
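&lt;p&gt;To make "OpenAI compatible" a bit more concrete, here is a minimal sketch of talking to that server using only the Python standard library. It assumes the server is listening on &lt;code&gt;localhost:8080&lt;/code&gt; and exposes the OpenAI-style &lt;code&gt;/v1/chat/completions&lt;/code&gt; route; the model label and the helper names are made up for illustration, and this is not how Ambrosio itself is wired up:&lt;/p&gt;

```python
import json
import urllib.request

# Assumption: the llama.cpp server runs locally on port 8080 and exposes
# the OpenAI-style /v1/chat/completions route.
BASE_URL = "http://localhost:8080/v1"


def build_chat_request(prompt, base_url=BASE_URL):
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": "starling-lm-7b-alpha",  # hypothetical label for the model
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.0,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

&lt;p&gt;With the server running, &lt;code&gt;ask("jasmine or basmati rice with falafel?")&lt;/code&gt; should return the model's answer as a string. Because the request shape matches OpenAI's, pointing &lt;code&gt;BASE_URL&lt;/code&gt; at another compatible provider should need little more than a new URL and an API key header.&lt;/p&gt;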
&lt;p&gt;Now, don't get me wrong, I think this model is great! But I'm looking for something that will replace my daily use of ChatGPT. After integrating it with &lt;a href="https://github.com/duarteocarmo/ambrosio"&gt;Ambrosio&lt;/a&gt;, the limitations were obvious. The first issue: it's slow, painfully slow (which was expected on my small machine). And even though it's pretty incredible for a small model, it's not going to replace even the most basic use cases.&lt;/p&gt;
&lt;p&gt;And if that sounds unfair, it's because it is! How could I expect the same level of quality from something that runs on a cheap VM as from something like ChatGPT? You're right: we can't. So what else can we do?&lt;/p&gt;
&lt;h2 id="mixtral-8x7b-and-the-rice-test"&gt;Mixtral 8x7B and the rice test&lt;/h2&gt;
&lt;p&gt;Let's say I was going to rent a bigger machine for a better model. What would I run on it? Certainly not a 7B model! Looking at the rankings again, the best open source thing currently is &lt;a href="https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF"&gt;Mixtral 8x7B&lt;/a&gt;, a mixture-of-experts model from &lt;a href="https://mistral.ai/"&gt;Mistral&lt;/a&gt;. Just glancing at the &lt;a href="https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF#provided-files"&gt;requirements&lt;/a&gt;, it was pretty clear this needed to be run on someone else's box. I decided to give &lt;a href="https://www.together.ai"&gt;together.ai&lt;/a&gt; a go. Their pricing is pretty much unbeatable and their inference speed is &lt;em&gt;blazingly&lt;/em&gt; fast. And of course they have an &lt;a href="https://docs.together.ai/docs/openai-api-compatibility"&gt;OpenAI compatible API&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;a href="https://duarteocarmo.com/images/60/ambrosio_vs_chatgpt.png" target="_blank"&gt;
&lt;img src="https://duarteocarmo.com/images/60/ambrosio_vs_chatgpt.png" alt="Ambrosio vs. ChatGPT vs. GPT-4" style="max-width:100%;border-radius: 2px"&gt;
&lt;/a&gt;&lt;/p&gt;
&lt;figcaption&gt; &lt;/figcaption&gt;
&lt;p&gt;&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;Cool, it's fast and cheap. Is it any good?&lt;/p&gt;
&lt;p&gt;Everyone has their own way of testing the quality of a model. Mine is what I call the &lt;em&gt;"rice test"&lt;/em&gt;. &lt;a href="https://github.com/duarteocarmo/ambrosio/blob/master/prompts/system.txt"&gt;My system prompt&lt;/a&gt;, stolen from Jeremy Howard, contains the following clause:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;if the request begins with the string &amp;quot;vv&amp;quot; [...] make your response as concise as possible
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The question above is a simple and straightforward scenario. I'm at the supermarket, I need to buy some rice to cook with my falafel. I open my phone and: "vv jasmin or basmati rice with falafel?". As you can see above, while GPT-4 gives me &lt;em&gt;exactly&lt;/em&gt; what I need, we can't say the same for Mixtral or GPT-3.5. I just want a quick and dirty answer, I don't want something complicated, I certainly do not want any Python code. So yeah, it doesn't quite pass the jasmin rice test. But we're getting closer. And I will keep testing it in the future.&lt;/p&gt;
&lt;p&gt;And sure, Mixtral may &lt;a href="https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1"&gt;not have been&lt;/a&gt; trained with a system prompt, but it should, &lt;a href="https://web.archive.org/web/20231030013339/https://docs.mistral.ai/usage/guardrailing/#appendix"&gt;in principle&lt;/a&gt;, support it.&lt;/p&gt;
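&lt;p&gt;If you want to run the rice test against your own model, here is a rough sketch of how it could be automated. Everything in it is an assumption for illustration: &lt;code&gt;SYSTEM_PROMPT&lt;/code&gt; is a stand-in that paraphrases the clause above rather than my actual prompt, and the 40-word limit and the "looks like code" check in &lt;code&gt;passes_rice_test&lt;/code&gt; are arbitrary thresholds, not a real benchmark.&lt;/p&gt;

```python
# Stand-in system prompt: a paraphrase of the clause quoted above,
# not the full prompt that Ambrosio actually ships with.
SYSTEM_PROMPT = (
    'If the request begins with the string "vv", '
    "make your response as concise as possible."
)


def build_messages(question):
    """Pair the system prompt with the user's question, chat-API style."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]


def passes_rice_test(answer, max_words=40):
    """Pass if the answer is short and does not drift into Python code.

    Both checks are hypothetical heuristics: a word budget and a scan
    for lines that start like Python statements.
    """
    words = answer.split()
    looks_like_code = any(
        line.lstrip().startswith(("import ", "def ", "print("))
        for line in answer.splitlines()
    )
    return max_words >= len(words) and not looks_like_code
```

&lt;p&gt;Send &lt;code&gt;build_messages("vv jasmin or basmati rice with falafel?")&lt;/code&gt; to whatever chat endpoint you're testing, then feed the reply to &lt;code&gt;passes_rice_test&lt;/code&gt;. It's crude, but it captures what I care about: a short answer and no surprise Python.&lt;/p&gt;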
&lt;h2 id="thoughts-and-taking-ambrosio-further"&gt;Thoughts and taking Ambrosio further&lt;/h2&gt;
&lt;p&gt;Since I ended up using together's API, I decided to also add image generation to Ambrosio (something like what I have with DALL·E 2). Together supports most Stable Diffusion models. My feelings about them are largely the same as my feelings about Mixtral and GPT-4. The open source models are good! But they are also a bit rough, and don't quite come close to what OpenAI is giving us. At least not yet.&lt;/p&gt;
&lt;p&gt;So, will Ambrosio &lt;em&gt;completely&lt;/em&gt; replace my use of GPT-4? Will I cancel my ChatGPT Plus subscription? Not really. At least, &lt;em&gt;not yet&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Every time I use these open source models, I feel like I'm missing out on a better answer. In a way, I've become spoiled by GPT-4. It just grasps what I want in a better way, and most importantly, it passes the rice test with flying colors!&lt;/p&gt;
&lt;p&gt;But the truth is also that, for most tasks, Mixtral is good enough. And it's a big leap forward from what the open source community previously had. I can only imagine what comes next.&lt;/p&gt;
&lt;p&gt;Exciting times.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Rebuilding /photos</title><link href="https://duarteocarmo.com/blog/rebuilding-photos.html" rel="alternate"/><published>2024-01-02T22:35:00+01:00</published><updated>2024-01-02T22:35:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2024-01-02:/blog/rebuilding-photos.html</id><summary type="html">&lt;p&gt;A couple of years ago, I decided to remove all my photos from Instagram. I wanted something for myself. Something that suited what I needed. The result was &lt;a href="/photos.html"&gt;/photos&lt;/a&gt;. I wrote about it when I built it &lt;a href="https://duarteocarmo.com/blog/self-hosting-instagram-python.html"&gt;too&lt;/a&gt;. Here's a snippet from that post:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;[...] Uploading is by no means as …&lt;/em&gt;&lt;/p&gt;</summary><content type="html">&lt;p&gt;A couple of years ago, I decided to remove all my photos from Instagram. I wanted something for myself. Something that suited what I needed. The result was &lt;a href="/photos.html"&gt;/photos&lt;/a&gt;. I wrote about it when I built it &lt;a href="https://duarteocarmo.com/blog/self-hosting-instagram-python.html"&gt;too&lt;/a&gt;. Here's a snippet from that post:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;[...] Uploading is by no means as simple as opening an app and snapping a picture. But I've created a small Python script that processes a photo, generates a thumbnail, asks some questions about it, and uploads it to S3 and my blog. And that's good enough [...]&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;It wasn't good enough. I guess some &lt;a href="https://course.ccs.neu.edu/cs5500f14/Notes/Prototyping1/planToThrowOneAway.html"&gt;principles&lt;/a&gt; are just timeless. It needed to be thrown away.&lt;/p&gt;
&lt;h2 id="one-to-throw-away"&gt;One to throw away&lt;/h2&gt;
&lt;p&gt;Soon after making it, it was pretty obvious it was a solution to throw into the trash. In 2 years, I published a little over 10 photos. It's not that I want to publish thousands of photos, but I've always enjoyed taking and sharing them. That whole run-a-script thing was just not working for me.&lt;/p&gt;
&lt;p&gt;In a way, it's a lot like writing. Add a little bit of friction to the process, and nobody will use it at all. The script I originally built quickly got lost and stopped working. Whenever I wanted to share a photo, my laptop wasn't around. When the laptop was around, I wasn't thinking of sharing photos. So no, not good enough.&lt;/p&gt;
&lt;p&gt;Whatever I was going to build to replace the old system needed to do one thing really well: get out of the way.&lt;/p&gt;
&lt;p&gt;I thought about it a bunch. An API? Should I interact with that API via shortcuts? Should I just build an entire front-end? What if I give &lt;a href="https://leptos.dev/"&gt;leptos&lt;/a&gt; a shot? Should I rclone a google photos album? It can't be that hard. I needed something that got out of the way.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/59/ambrosio.png" alt="Ambrosio create photo" style="max-width:95%;border-radius: 2px"&gt;&lt;/p&gt;
&lt;figcaption&gt;&lt;/figcaption&gt;
&lt;p&gt;&lt;/center&gt;&lt;/p&gt;
&lt;h2 id="ambrosio"&gt;Ambrosio&lt;/h2&gt;
&lt;p&gt;I needed something that was already part of my routine. The result is &lt;a href="https://github.com/duarteocarmo/ambrosio"&gt;Ambrosio&lt;/a&gt;. Some will recognize the &lt;a href="https://youtu.be/oSKi309VnG8?si=92t2m6KNgsX092FX&amp;amp;t=9"&gt;name&lt;/a&gt;. Ambrosio is a Telegram bot that was designed to be my personal assistant.&lt;/p&gt;
&lt;p&gt;I designed Ambrosio to be able to have different &lt;a href="https://github.com/duarteocarmo/ambrosio/tree/master/modes"&gt;&lt;em&gt;modes&lt;/em&gt;&lt;/a&gt;. The first one is the &lt;code&gt;photo&lt;/code&gt; mode. It's responsible for performing &lt;a href="https://www.crowdstrike.com/cybersecurity-101/observability/crud/"&gt;CRUD&lt;/a&gt; operations on an &lt;a href="https://www.cloudflare.com/developer-platform/r2/"&gt;R2 bucket&lt;/a&gt;. To maintain the speed of /photos, it also generates a &lt;code&gt;webp&lt;/code&gt; thumbnail with reduced quality. If needed, it triggers a rebuild of my website using a &lt;a href="https://developers.cloudflare.com/pages/configuration/deploy-hooks/"&gt;deploy hook&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Instead of having to create a markdown page for every new photo, I've built another small pelican &lt;a href="https://github.com/duarteocarmo/duarteocarmo.com/blob/master/plugins/photos/photos.py"&gt;plugin&lt;/a&gt;. Before building the website, it downloads all the metadata for each photo, and generates the pages that it needs to. So that website builds remain fast on my local machine, I also added disk caching. Remember: get out of the way!&lt;/p&gt;
&lt;h2 id="on-go"&gt;On Go&lt;/h2&gt;
&lt;p&gt;I decided to build Ambrosio using something I've never used before: Go. I've tinkered with a couple of code bases but never really used it to build something from scratch. The idea of something not as strict as Rust, but also not as flexible as Python really sparked my curiosity.&lt;/p&gt;
&lt;p&gt;Go gives me mixed feelings, but Go gives me hope. It is strict, but not &lt;em&gt;too&lt;/em&gt; strict. It gives me a confidence when writing code that Python has never given me. And it gives me guarantees that if the compiler does not complain, things will mostly work fine. There are some things I still don't understand about Go. The first is error handling. Having a bunch of &lt;code&gt;if err != nil&lt;/code&gt; checks throughout the code base is not particularly nice to look at. I'm also hesitant about the whole one-letter variable names thing. Things tend to become cryptic.&lt;/p&gt;
&lt;p&gt;It's fun how programming languages all start looking the same the more you use them. For now, I'm curious to see how my opinion about Go evolves with time.&lt;/p&gt;
&lt;p&gt;One thing is clear though: Whatever technology you use, whatever framework you go with, technology is beautiful when it &lt;em&gt;gets out of the way&lt;/em&gt;.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>You can now listen to this blog</title><link href="https://duarteocarmo.com/blog/you-can-now-listen-to-this-blog.html" rel="alternate"/><published>2023-12-08T05:00:00+01:00</published><updated>2023-12-08T05:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2023-12-08:/blog/you-can-now-listen-to-this-blog.html</id><summary type="html">&lt;p&gt;One of my favorite &lt;a href="https://www.publico.pt/autor/joao-miguel-tavares"&gt;Portuguese columnists&lt;/a&gt; has this weird thing about his column. Maybe it's more common than I thought. For every piece he publishes, he also publishes a podcast version along with it.&lt;/p&gt;
&lt;p&gt;Now, either &lt;a href="https://www.publico.pt/"&gt;Publico&lt;/a&gt; has 27th century text-to-speech (TTS) technology, or he's &lt;em&gt;actually&lt;/em&gt; reading them. I don't …&lt;/p&gt;</summary><content type="html">&lt;p&gt;One of my favorite &lt;a href="https://www.publico.pt/autor/joao-miguel-tavares"&gt;Portuguese columnists&lt;/a&gt; has this weird thing about his column. Maybe it's more common than I thought. For every piece he publishes, he also publishes a podcast version along with it.&lt;/p&gt;
&lt;p&gt;Now, either &lt;a href="https://www.publico.pt/"&gt;Publico&lt;/a&gt; has 27th century text-to-speech (TTS) technology, or he's &lt;em&gt;actually&lt;/em&gt; reading them. I don't have a problem with that, but I'm &lt;em&gt;pretty&lt;/em&gt; sure we could automate that part of the process with today's tech.&lt;/p&gt;
&lt;p&gt;And yeah, I've heard all the rage about voice cloning services like &lt;a href="https://elevenlabs.io/pricing"&gt;ElevenLabs&lt;/a&gt;. But if you've been following this blog for a while, you probably guessed that we're not just gonna use an API. We're probably gonna build one from scratch.&lt;/p&gt;
&lt;h2 id="an-engine"&gt;An engine&lt;/h2&gt;
&lt;p&gt;The premise did not appear simple to build, but was easy to understand. Something that transcribes every new article of this blog using &lt;em&gt;my own&lt;/em&gt; voice. It needs to be cheap, seamless, and most importantly, not get in the way. Writing is &lt;em&gt;enough&lt;/em&gt; work as is.&lt;/p&gt;
&lt;p&gt;The result is &lt;a href="https://github.com/duarteocarmo/podcaster/"&gt;podcaster&lt;/a&gt;. It runs 100% on GitHub Actions, it scans every new blog post in my RSS feed and uses &lt;a href="https://huggingface.co/coqui/XTTS-v2"&gt;XTTS-v2&lt;/a&gt; to transcribe it. The only thing it needs from me is a 1-min audio file. I tried &lt;a href="https://tts.readthedocs.io/en/latest/models/bark.html"&gt;Bark&lt;/a&gt; and a couple of other models, but this was the only one that made Vittoria come into the room when I was testing things around.&lt;/p&gt;
&lt;p&gt;It's not that I hate infrastructure, I just wanted the whole thing to run on CI. All these TTS models are slow when running on the CPU. But instead of embarking on another painful journey through the world of GPU computing, I found &lt;a href="https://modal.com"&gt;Modal&lt;/a&gt;. I'm happy to report that I've regained faith in the future of serverless GPUs:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# define the image&lt;/span&gt;

&lt;span class="n"&gt;MODAL_IMAGE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;modal&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;debian_slim&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pip_install_from_pyproject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;pyproject.toml&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;apt_install&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;ffmpeg&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;stub&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;modal&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Stub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;modal-app&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;MODAL_IMAGE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# create the function&lt;/span&gt;

&lt;span class="nd"&gt;@stub&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gpu&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;any&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;transcribe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ParsedArticle&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;voice_file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;VOICE_FILE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MODEL_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;LANGUAGE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# this part is executed in a GPU powered machine&lt;/span&gt;
    &lt;span class="c1"&gt;# ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The engine stores all transcripts in S3, automatically generates a podcast feed from them (using &lt;a href="https://feedgen.kiesow.be/"&gt;feedgen&lt;/a&gt;), and uses web hooks to trigger a new build of this blog. Again, pretty much for free. If you're curious or interested in taking it for a spin, &lt;a href="https://github.com/duarteocarmo/podcaster#readme"&gt;should be straightforward&lt;/a&gt; to get started.&lt;/p&gt;
&lt;h2 id="integrating-with-pelican"&gt;Integrating with Pelican&lt;/h2&gt;
&lt;p&gt;As I said before: I really &lt;em&gt;don't&lt;/em&gt; like adding friction to my writing. It's already hard as is! The challenge then was to figure a way of automagically updating the blog with available transcripts after I publish, without me having to do anything at all.&lt;/p&gt;
&lt;p&gt;Fortunately, &lt;a href="https://justinmayer.com/about/"&gt;Justin&lt;/a&gt;, besides being great company at PyCon Italia every year, has also built a pretty &lt;a href="https://docs.getpelican.com/en/latest/plugins.html"&gt;robust plugin system for Pelican&lt;/a&gt;. All I had to do was add a &lt;a href="https://github.com/duarteocarmo/duarteocarmo.com/blob/master/plugins/podcast/podcast.py"&gt;&lt;code&gt;podcaster&lt;/code&gt; plugin&lt;/a&gt; to this website. The plugin automatically matches articles to corresponding episodes, and adds that short HTML snippet you're seeing above. It should be &lt;em&gt;build once&lt;/em&gt; and let run. Hopefully, at least.&lt;/p&gt;
&lt;h2 id="final-thoughts"&gt;Final thoughts&lt;/h2&gt;
&lt;p&gt;I had a lot of fun building this, and also learned a lot. First, it demystified the whole podcast hosting thing for me. Turns out, it's just a bunch of mp3 files in a bucket with &lt;a href="https://podcasts.apple.com/dk/podcast/duarte-o-carmos-articles/id1719493997"&gt;an RSS feed&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Often I've settled for running ML models on CPU just because deploying with a GPU was much more of a pain, and didn't add any happiness to the process. At least for now, GPUs are here to stay, hopefully so are services like &lt;a href="https://modal.com/"&gt;Modal&lt;/a&gt;. Pythonic, easy to use, and easy to isolate from the rest of the code base.&lt;/p&gt;
&lt;p&gt;Finally, I was very impressed with the state of TTS and voice cloning technology. Is it perfect? No. Does it have some artifacts? Yes. Does it sound like a robot sometimes? Sure. But remember, I &lt;em&gt;only gave it a minute&lt;/em&gt; of my voice.&lt;/p&gt;
&lt;p&gt;Open source never ceases to amaze me.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Should you go into management consulting? Maybe.</title><link href="https://duarteocarmo.com/blog/management-consulting.html" rel="alternate"/><published>2023-11-08T08:00:00+01:00</published><updated>2023-11-08T08:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2023-11-08:/blog/management-consulting.html</id><summary type="html">&lt;p&gt;A couple of days ago, someone shared &lt;a href="https://www.youtube.com/watch?v=AiOUojVd6xQ"&gt;this&lt;/a&gt; John Oliver video with me. In it, he shows all the ways McKinsey is a terrible organization. Not a minute in and I was already thinking about that management consulting blog post I &lt;em&gt;always&lt;/em&gt; wanted to write.&lt;/p&gt;
&lt;p&gt;When I talk with people …&lt;/p&gt;</summary><content type="html">&lt;p&gt;A couple of days ago, someone shared &lt;a href="https://www.youtube.com/watch?v=AiOUojVd6xQ"&gt;this&lt;/a&gt; John Oliver video with me. In it, he shows all the ways McKinsey is a terrible organization. Not a minute in and I was already thinking about that management consulting blog post I &lt;em&gt;always&lt;/em&gt; wanted to write.&lt;/p&gt;
&lt;p&gt;When I talk with people who just finished university, a lot of them wonder if it's the right place to go. &lt;em&gt;Of course&lt;/em&gt;, judging by the tasty recruiting videos, it looks like the absolute perfect place for your &lt;em&gt;typical&lt;/em&gt; graduate. The reality is a &lt;em&gt;bit&lt;/em&gt; more nuanced.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Attention&lt;/em&gt; to detail. That was the thing that struck me first. From manically aligning elements on a slide, to someone going meticulously through my email signature to ensure everything is as it should be. If you're the type that struggles with detail you'll probably suffer a bit. Truth is, attention to detail is generally something most working environments admire, and strive for (or maybe they should).&lt;/p&gt;
&lt;p&gt;In a way, management consulting feels like an extension of a University. Everything is &lt;em&gt;pre&lt;/em&gt; structured, everything is &lt;em&gt;pre&lt;/em&gt; defined. Even before you get your hands dirty, you'll know exactly the path that you're expected to follow, and how fast you're expected to follow it. &lt;em&gt;Up or out&lt;/em&gt;, that's a good summary of the approach. If you're not being promoted, then you should probably get out. Naturally, most people stay 2 years and leave. In any case - a structured and defined environment is where many are able to learn at their best.&lt;/p&gt;
&lt;p&gt;What I miss the most is by far the &lt;em&gt;people&lt;/em&gt;. I know - I sound just like a propaganda boy. But there's nothing I enjoy more than being the dumbest in the room. Most people in these places are driven, motivated, and hard-working. You'll meet the smartest people you will probably ever work with. Like in the military, you'll make long lasting bonds over traumatic experiences (e.g., long nights, tough clients). You'll build a network of great people - with whom you'll stay connected for the rest of your career. I certainly did.&lt;/p&gt;
&lt;p&gt;But it's not all rainbows and unicorns.&lt;/p&gt;
&lt;p&gt;Let's start with the thing most people fear. The hours. Yes, the hours are long. I generally don't dislike long hours. With an important premise: I &lt;em&gt;have&lt;/em&gt; to be doing something I &lt;em&gt;really&lt;/em&gt; enjoy. But in these firms, the hours are not long because they &lt;em&gt;need&lt;/em&gt; to be. The hours are long because you want to show your client (and your firm) that you are &lt;em&gt;always&lt;/em&gt; available. Sure, this means working longer hours - but less intense hours. For some, this will look like an absolute waste of time - why the hell are we here? It's not about doing the required work. It's about &lt;em&gt;being there&lt;/em&gt;. Or worse, it's about &lt;em&gt;showing&lt;/em&gt; that you were there.&lt;/p&gt;
&lt;p&gt;Soon, you'll realize you don't really have &lt;em&gt;choices&lt;/em&gt;. You &lt;em&gt;don't&lt;/em&gt; get to decide what projects you work on. You &lt;em&gt;don't&lt;/em&gt; get to decide what business area you specialize in. You &lt;em&gt;don't&lt;/em&gt; get to decide when and where you work. You don't get to decide what dinners, events, and gatherings you go to (a common strategy to work a bit less). If you have no clue what you want to do - this is great. You just need to go along. If you know what you want to do - this becomes &lt;em&gt;problematic&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Which brings me to my last point. Management consulting, from my experience, is a mix of a University and Military Service. You are &lt;em&gt;not&lt;/em&gt; rewarded for thinking differently. You are &lt;em&gt;not&lt;/em&gt; rewarded for exploring things others have not. Your innovative way of solving a problem is a risk, rather than an opportunity. For those who just want to do work, this is great - you don't really have to think about it. For the creatives, that want to spend time in things they like, and want to solve problems in crazy ways - this can be painful.&lt;/p&gt;
&lt;p&gt;Is management consulting for you? I have no clue.&lt;/p&gt;
&lt;p&gt;My experience was positive. Until it wasn't. It can be a great step for those of us who have absolutely no idea what we want to do after university. It's an opportunity to see a lot of different things - with a smart bunch of people. For those who know exactly what they want - it might feel like a slowdown. For most of us, it's a mix.&lt;/p&gt;
&lt;p&gt;In any case. At least now you know what you're getting yourself into.&lt;/p&gt;
&lt;p&gt;I sure didn't. But I can't say I regret it.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;Disclaimer: This post is highly biased towards my experience. As it should - this is MY blog after all.&lt;/em&gt;&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Changelog neural search</title><link href="https://duarteocarmo.com/blog/changelog-neural-search-superduperdb.html" rel="alternate"/><published>2023-10-06T05:00:00+02:00</published><updated>2023-10-06T05:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2023-10-06:/blog/changelog-neural-search-superduperdb.html</id><summary type="html">&lt;p&gt;Search is one of the most important breakthroughs of the internet. Some are saying a list of blue links is not enough - and that AI &lt;em&gt;will&lt;/em&gt; overthrow search. I don't know if we're about to witness a revolution. But as with most things - there's only one way to know - to …&lt;/p&gt;</summary><content type="html">&lt;p&gt;Search is one of the most important breakthroughs of the internet. Some are saying a list of blue links is not enough - and that AI &lt;em&gt;will&lt;/em&gt; overthrow search. I don't know if we're about to witness a revolution. But as with most things - there's only one way to know - to build and use it &lt;em&gt;myself&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;I like podcasts a lot. There's really nothing like hearing people talk about things I know &lt;em&gt;nothing&lt;/em&gt; about. Some of my favorite podcasts are produced by the &lt;a href="https://changelog.com/"&gt;Changelog&lt;/a&gt; network. More than once, I've had to use their &lt;a href="https://changelog.com/search?q=embedding"&gt;search engine&lt;/a&gt; when researching something that was said during an episode.&lt;/p&gt;
&lt;p&gt;One of the best things about the Changelog is that the &lt;a href="https://github.com/thechangelog"&gt;whole thing is open source&lt;/a&gt;. From the podcast engine itself, to the &lt;a href="https://github.com/thechangelog/transcripts"&gt;transcripts of every episode&lt;/a&gt;. Why not take all of these transcripts and build an AI-powered™ search engine around them?&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;a href="https://changelog.duarteocarmo.com"&gt;
&lt;img src="https://duarteocarmo.com/images/56/search.png" alt="Neural search for the changelog" style="max-width:100%;margin-bottom:-1em;"&gt;
&lt;/a&gt;&lt;/p&gt;
&lt;figcaption&gt;&lt;a target="_blank" href="https://changelog.duarteocarmo.com"&gt;changelog.duarteocarmo.com&lt;/a&gt;&lt;/figcaption&gt;
&lt;p&gt;&lt;/center&gt;&lt;/p&gt;
&lt;h2 id="how-its-built"&gt;How it's built&lt;/h2&gt;
&lt;p&gt;Before I describe the stack, let's get the obvious out of the way: the whole thing is open source. Both the &lt;a href="https://github.com/duarteocarmo/thechangelogbot-backend"&gt;back-end&lt;/a&gt; and the &lt;a href="https://github.com/duarteocarmo/thechangelogbot-frontend"&gt;front-end&lt;/a&gt;. If you prefer to go and poke around the code yourself, be my guest.&lt;/p&gt;
&lt;p&gt;I love the chunk-embed-search-retrieve dance as much as the next guy, but for this one, I wanted to keep things a bit simpler, so I'm letting &lt;a href="https://www.superduperdb.com/"&gt;SuperDuperDB&lt;/a&gt; do most of the heavy lifting for me. With it, all I really need to do is add the embedding model to my serverless MongoDB instance, and it handles the rest for me:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# add model to DB&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SentenceTransformer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;VectorIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;identifier&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;index_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;indexing_listener&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Listener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# search the DB&lt;/span&gt;

&lt;span class="n"&gt;cur&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;$regex&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;podcast&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;practicalai&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;}})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;like&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;text&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;What are embeddings&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vector_index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;index_id&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;For the front-end, I finally took &lt;a href="https://nextjs.org/"&gt;NextJS&lt;/a&gt; for a spin. Love the productivity gains - especially when we're talking about developer experience. &lt;a href="https://vercel.com/"&gt;Vercel&lt;/a&gt; is absolutely killing the developer experience side of things. On the other side, I have no clue how most of the magic is working - and I'm not sure that's a good thing.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>A poor man's guide to fine-tuning Llama 2</title><link href="https://duarteocarmo.com/blog/fine-tune-llama-2-telegram.html" rel="alternate"/><published>2023-09-26T13:00:00+02:00</published><updated>2023-09-26T13:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2023-09-26:/blog/fine-tune-llama-2-telegram.html</id><summary type="html">&lt;p&gt;Last Friday, &lt;a href="https://orbit.dtu.dk/en/persons/konstantinos-kalogeropoulos"&gt;Kostas&lt;/a&gt; and I found ourselves discussing AI over beers again. He was telling me how he thinks everyone else is &lt;em&gt;decades&lt;/em&gt; away from OpenAI - and that they likely won't catch up soon. I disagreed. In &lt;em&gt;fact&lt;/em&gt;, I think open source is quickly catching up, and getting &lt;a href="https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard"&gt;closer&lt;/a&gt; by …&lt;/p&gt;</summary><content type="html">&lt;p&gt;Last Friday, &lt;a href="https://orbit.dtu.dk/en/persons/konstantinos-kalogeropoulos"&gt;Kostas&lt;/a&gt; and I found ourselves discussing AI over beers again. He was telling me how he thinks everyone else is &lt;em&gt;decades&lt;/em&gt; away from OpenAI - and that they likely won't catch up soon. I disagreed. In &lt;em&gt;fact&lt;/em&gt;, I think open source is quickly catching up, and getting &lt;a href="https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard"&gt;closer&lt;/a&gt; by the day.&lt;/p&gt;
&lt;p&gt;Around four months ago, I &lt;a href="/blog/fine-tune-flan-t5-telegram.html"&gt;wrote a small tutorial&lt;/a&gt; on fine-tuning Flan T5 to generate conversations in my friends' group chat. It worked &lt;em&gt;ok(ish)&lt;/em&gt;, but the process was clunky and took way too long.&lt;/p&gt;
&lt;p&gt;So, here we are, four months later, with the same question: how easy is it to train an LLM on my friends' group chat? Turns out, for a couple of dollars and one hour, you can get pretty far.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;br&gt;&lt;/p&gt;
&lt;video  style="max-width:100%;border-radius: 2px" autoplay loop muted playsinline&gt;
  &lt;source src="https://duarteocarmo.com/images/55/llama-tiger-optimized.mp4" type="video/mp4" &gt;
&lt;/video&gt;
&lt;figcaption&gt;&lt;em&gt;Tiger-llama in action, simulating a Telegram conversation&lt;/em&gt;&lt;/figcaption&gt;
&lt;p&gt;&lt;/center&gt;&lt;/p&gt;
&lt;h2 id="data"&gt;Data&lt;/h2&gt;
&lt;p&gt;The goal is clear: fine-tune &lt;a href="https://huggingface.co/meta-llama/Llama-2-7b-hf"&gt;Llama 2&lt;/a&gt; to automatically generate conversations that would normally happen in the group chat that I've had with my close friends for years now. The first step is to export the data from Telegram (which is pretty &lt;a href="https://telegram.org/blog/export-and-more"&gt;easy&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;For most LLMs, you need to massage the data into a specific format before training. This means providing things like the &lt;code&gt;instruction&lt;/code&gt;, the &lt;code&gt;input&lt;/code&gt;, the &lt;code&gt;system_prompt&lt;/code&gt;, etc. But all I wanted was to generate conversations, so I started by creating a large &lt;code&gt;jsonl&lt;/code&gt; file where every row is a chunk of conversation, and a &lt;code&gt;###&lt;/code&gt; token marks the start of each message and its speaker.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# dataset.jsonl&lt;/span&gt;

&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;text&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;### Friend 1: This is a question### Friend 2: This is a reply### Friend 3: What the hell are you guys talking about?&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;text&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;### Friend 4: Who&amp;#39;s coming tonight?### Friend 2: No one, it&amp;#39;s literally Monday.&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;#...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
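
&lt;p&gt;To make the format concrete, here is a minimal sketch of how a Telegram export could be turned into those rows. It is not the exact script I used: the &lt;code&gt;flatten_text&lt;/code&gt; and &lt;code&gt;to_jsonl_rows&lt;/code&gt; helpers and the &lt;code&gt;chunk_size&lt;/code&gt; of 20 are made up for illustration, and it assumes the export is a list of messages with &lt;code&gt;type&lt;/code&gt;, &lt;code&gt;from&lt;/code&gt;, and &lt;code&gt;text&lt;/code&gt; fields.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;import json


def flatten_text(text):
    """Assumption: formatted messages arrive as a list of strings and dicts."""
    if isinstance(text, str):
        return text
    return "".join(p if isinstance(p, str) else p.get("text", "") for p in text)


def to_jsonl_rows(messages, chunk_size=20):
    """Prefix each message with '### name:' and join chunk_size messages per row."""
    lines = []
    for m in messages:
        body = flatten_text(m.get("text", "")).strip()
        # skip service events (joins, pins, ...) and empty messages
        if m.get("type") == "message" and m.get("from") and body:
            lines.append(f"### {m['from']}: {body}")
    for i in range(0, len(lines), chunk_size):
        yield json.dumps({"text": "".join(lines[i:i + chunk_size])}, ensure_ascii=False)


# tiny in-memory example instead of a real export
example = [
    {"type": "message", "from": "Friend 1", "text": "This is a question"},
    {"type": "message", "from": "Friend 2", "text": ["This is ", {"type": "bold", "text": "a reply"}]},
    {"type": "service", "from": None, "text": ""},
]
rows = list(to_jsonl_rows(example))
print(rows[0])
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Writing each row on its own line of &lt;code&gt;dataset.jsonl&lt;/code&gt; gives the file shown above. The chunk size is a knob worth playing with: longer chunks give the model more context per example, but fewer examples overall.&lt;/p&gt;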

&lt;p&gt;After getting the data ready, I went ahead and pushed it to the Hugging Face Hub:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# huggingface-cli login --token hf_XXXXXXXX (to log into Hugging Face)&lt;/span&gt;

&lt;span class="c1"&gt;# read and split&lt;/span&gt;

&lt;span class="n"&gt;tiger_llama&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;  &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;dataset.jsonl&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;train_tiger&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_tiger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;train_test_split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;tiger_llama&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shuffle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# push to hub&lt;/span&gt;

&lt;span class="n"&gt;train_tiger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Dataset&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_pandas&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_tiger&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;test_tiger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Dataset&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_pandas&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_tiger&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;ds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DatasetDict&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;ds&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;train&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;train_tiger&lt;/span&gt;
&lt;span class="n"&gt;ds&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;test&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;test_tiger&lt;/span&gt;

&lt;span class="n"&gt;dataset_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;duarteocarmo/tiger-llama&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;ds&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;push_to_hub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dataset_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;branch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;main&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;private&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;With all the conversations pushed to the hub (&lt;em&gt;cough&lt;/em&gt; - privacy - &lt;em&gt;cough&lt;/em&gt;), it was now time to train this model.&lt;/p&gt;
&lt;h2 id="training-with-axolotl"&gt;Training with axolotl&lt;/h2&gt;
&lt;p&gt;For the first couple of days, I got a bit frustrated. I tested a bunch of &lt;a href="https://huggingface.co/blog/llama2#fine-tuning-with-peft"&gt;different&lt;/a&gt; &lt;a href="https://github.com/brevdev/notebooks/blob/main/llama2-finetune.ipynb"&gt;tutorials&lt;/a&gt; on how to fine-tune Llama 2, but none of them really got me anywhere. Some took way too long to train, others resulted in a model that didn't really generate anything interesting. I confess I &lt;a href="https://platform.openai.com/docs/guides/fine-tuning"&gt;almost fell to the dark side&lt;/a&gt;! But then I stumbled upon &lt;a href="https://github.com/OpenAccess-AI-Collective/axolotl"&gt;axolotl&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/OpenAccess-AI-Collective/axolotl"&gt;Axolotl&lt;/a&gt; is designed to "&lt;em&gt;streamline the fine-tuning of LLMs&lt;/em&gt;". It supports a bunch of different models and training configurations. The best part? To fine-tune a model, all you need is pretty much a config file. Yes, you heard that right - &lt;em&gt;just&lt;/em&gt; a &lt;code&gt;yaml&lt;/code&gt; file. The only things I needed to tweak were the base model (in our case, Llama 2 - &lt;code&gt;meta-llama/Llama-2-7b-hf&lt;/code&gt;), and the dataset sections:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# llama-tiger.yaml&lt;/span&gt;

&lt;span class="c1"&gt;# base model&lt;/span&gt;

&lt;span class="nt"&gt;base_model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;meta-llama/Llama-2-7b-hf&lt;/span&gt;
&lt;span class="nt"&gt;base_model_config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;meta-llama/Llama-2-7b-hf&lt;/span&gt;
&lt;span class="nt"&gt;model_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;LlamaForCausalLM&lt;/span&gt;
&lt;span class="nt"&gt;tokenizer_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;LlamaTokenizer&lt;/span&gt;
&lt;span class="nt"&gt;is_llama_derived_model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;true&lt;/span&gt;

&lt;span class="c1"&gt;# ...&lt;/span&gt;

&lt;span class="c1"&gt;# my dataset&lt;/span&gt;

&lt;span class="c1"&gt;# other formats: https://github.com/OpenAccess-AI-Collective/axolotl#dataset&lt;/span&gt;

&lt;span class="nt"&gt;datasets&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;duarteocarmo/tiger-llama&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;completion&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;field&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;text&lt;/span&gt;
&lt;span class="nt"&gt;dataset_prepared_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;last_run_prepared&lt;/span&gt;
&lt;span class="nt"&gt;hub_model_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;duarteocarmo/tiger-llama&lt;/span&gt;
&lt;span class="nt"&gt;val_set_size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;0.01&lt;/span&gt;
&lt;span class="nt"&gt;output_dir&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;./qlora-out&lt;/span&gt;

&lt;span class="c1"&gt;# wandb monitoring&lt;/span&gt;

&lt;span class="nt"&gt;wandb_project&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;llama-tiger&amp;quot;&lt;/span&gt;
&lt;span class="nt"&gt;wandb_log_model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;checkpoint&amp;quot;&lt;/span&gt;

&lt;span class="c1"&gt;### ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;details&gt;
  &lt;summary&gt;Expand for the full &lt;code&gt;llama-tiger.yaml&lt;/code&gt; file&lt;/summary&gt;


&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Image: winglian/axolotl:main-py3.10-cu118-2.0.1&lt;/span&gt;

&lt;span class="nt"&gt;base_model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;meta-llama/Llama-2-7b-hf&lt;/span&gt;
&lt;span class="nt"&gt;base_model_config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;meta-llama/Llama-2-7b-hf&lt;/span&gt;
&lt;span class="nt"&gt;model_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;LlamaForCausalLM&lt;/span&gt;
&lt;span class="nt"&gt;tokenizer_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;LlamaTokenizer&lt;/span&gt;
&lt;span class="nt"&gt;is_llama_derived_model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;true&lt;/span&gt;

&lt;span class="nt"&gt;load_in_8bit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;false&lt;/span&gt;
&lt;span class="nt"&gt;load_in_4bit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;true&lt;/span&gt;
&lt;span class="nt"&gt;strict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;false&lt;/span&gt;

&lt;span class="nt"&gt;datasets&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;duarteocarmo/tiger-llama&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;completion&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;field&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;text&lt;/span&gt;
&lt;span class="nt"&gt;dataset_prepared_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;last_run_prepared&lt;/span&gt;
&lt;span class="nt"&gt;hub_model_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;duarteocarmo/tiger-llama&lt;/span&gt;
&lt;span class="nt"&gt;val_set_size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;0.01&lt;/span&gt;
&lt;span class="nt"&gt;output_dir&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;./qlora-out&lt;/span&gt;

&lt;span class="nt"&gt;adapter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;qlora&lt;/span&gt;
&lt;span class="nt"&gt;lora_model_dir&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="nt"&gt;sequence_len&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;4096&lt;/span&gt;
&lt;span class="nt"&gt;sample_packing&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;true&lt;/span&gt;
&lt;span class="nt"&gt;pad_to_sequence_len&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;true&lt;/span&gt;

&lt;span class="nt"&gt;lora_r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;32&lt;/span&gt;
&lt;span class="nt"&gt;lora_alpha&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;16&lt;/span&gt;
&lt;span class="nt"&gt;lora_dropout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;0.05&lt;/span&gt;
&lt;span class="nt"&gt;lora_target_modules&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="nt"&gt;lora_target_linear&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;true&lt;/span&gt;
&lt;span class="nt"&gt;lora_fan_in_fan_out&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="nt"&gt;wandb_project&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;llama-tiger&amp;quot;&lt;/span&gt;
&lt;span class="nt"&gt;wandb_entity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="nt"&gt;wandb_watch&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="nt"&gt;wandb_run_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="nt"&gt;wandb_log_model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;checkpoint&amp;quot;&lt;/span&gt;

&lt;span class="nt"&gt;gradient_accumulation_steps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;4&lt;/span&gt;
&lt;span class="nt"&gt;micro_batch_size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;2&lt;/span&gt;
&lt;span class="nt"&gt;num_epochs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;3&lt;/span&gt;
&lt;span class="nt"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;paged_adamw_32bit&lt;/span&gt;
&lt;span class="nt"&gt;lr_scheduler&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;cosine&lt;/span&gt;
&lt;span class="nt"&gt;learning_rate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;0.0002&lt;/span&gt;

&lt;span class="nt"&gt;train_on_inputs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;false&lt;/span&gt;
&lt;span class="nt"&gt;group_by_length&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;false&lt;/span&gt;
&lt;span class="nt"&gt;bf16&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;true&lt;/span&gt;
&lt;span class="nt"&gt;fp16&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;false&lt;/span&gt;
&lt;span class="nt"&gt;tf32&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;false&lt;/span&gt;

&lt;span class="nt"&gt;gradient_checkpointing&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;true&lt;/span&gt;
&lt;span class="nt"&gt;early_stopping_patience&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="nt"&gt;resume_from_checkpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="nt"&gt;local_rank&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="nt"&gt;logging_steps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;1&lt;/span&gt;
&lt;span class="nt"&gt;xformers_attention&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="nt"&gt;flash_attention&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;true&lt;/span&gt;

&lt;span class="nt"&gt;warmup_steps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;10&lt;/span&gt;
&lt;span class="nt"&gt;eval_steps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;20&lt;/span&gt;
&lt;span class="nt"&gt;eval_table_size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;5&lt;/span&gt;
&lt;span class="nt"&gt;save_steps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="nt"&gt;debug&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="nt"&gt;deepspeed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="nt"&gt;weight_decay&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;0.0&lt;/span&gt;
&lt;span class="nt"&gt;fsdp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="nt"&gt;fsdp_config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="nt"&gt;special_tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;bos_token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&amp;lt;s&amp;gt;&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;eos_token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&amp;lt;/s&amp;gt;&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;unk_token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&amp;lt;unk&amp;gt;&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;/details&gt;

&lt;p&gt;Unfortunately, with these things, I needed a GPU to run the training. Given that I'm (&lt;em&gt;apparently&lt;/em&gt;) considered &lt;a href="https://www.businessinsider.com/gpu-rich-vs-gpu-poor-tech-companies-in-each-group-2023-8?r=US&amp;amp;IR=T"&gt;GPU poor&lt;/a&gt;, I used &lt;a href="https://vast.ai/"&gt;Vast.ai&lt;/a&gt;. I picked a machine with at least 40GB of GPU RAM that was relatively close to me. These cost around 1 USD/hour. I also used the Axolotl docker image (e.g., &lt;code&gt;winglian/axolotl:main-py3.10-cu118-2.0.1&lt;/code&gt;) so that everything was pre-installed when the machine turned on.&lt;/p&gt;
&lt;p&gt;Once the machine was up, I ssh'd into it and ran:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;$&lt;span class="w"&gt; &lt;/span&gt;huggingface-cli&lt;span class="w"&gt; &lt;/span&gt;login&lt;span class="w"&gt; &lt;/span&gt;--token&lt;span class="w"&gt; &lt;/span&gt;hf_MY_HUGGINGFACE_TOKEN_WITH_WRITE_ACCESS
$&lt;span class="w"&gt; &lt;/span&gt;wandb&lt;span class="w"&gt; &lt;/span&gt;login&lt;span class="w"&gt; &lt;/span&gt;MY_WANDB_API_KEY
$&lt;span class="w"&gt; &lt;/span&gt;accelerate&lt;span class="w"&gt; &lt;/span&gt;launch&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;axolotl.cli.train&lt;span class="w"&gt; &lt;/span&gt;llama-tiger.yaml
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;And we were off to the races. With a single command, I was fine-tuning Llama 2 on my custom dataset. While training, Axolotl automatically logs everything to Weights &amp;amp; Biases, so we can monitor how the losses are evolving. As a bonus, it also shows the model outputs, so I can follow how the model's generations improve during training:&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/55/monitoring.png" alt="wandb monitoring dashboard" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;My dataset has ~12K rows. Training took around 1 hour in total, using a machine with 2xA40 GPUs. In summary: &lt;strong&gt;I spent around 2 USD to fine-tune the whole model&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Once training was done, the fine-tuned model (or &lt;a href="https://huggingface.co/docs/transformers/main/peft"&gt;adapter&lt;/a&gt;, to be specific) was saved in the &lt;code&gt;./qlora-out&lt;/code&gt; directory (the &lt;code&gt;output_dir&lt;/code&gt; from the config). With my configuration, it was also uploaded to the Hugging Face repository I provided in &lt;code&gt;hub_model_id&lt;/code&gt;. On to inference.&lt;/p&gt;
&lt;h2 id="inference"&gt;Inference&lt;/h2&gt;
&lt;p&gt;The fine-tuning result is not an actual Llama 2 model, but an &lt;em&gt;adapter&lt;/em&gt; to the model (Axolotl uses &lt;a href="https://github.com/artidoro/qlora"&gt;qlora&lt;/a&gt; by default for Llama models). So in the end, the adapter is a mere 320 MB.&lt;/p&gt;
&lt;p&gt;Using Axolotl, inference is also pretty straightforward: all I need to do is download the adapter and launch the Axolotl inference command:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# download from fine tuned repo&lt;/span&gt;

git&lt;span class="w"&gt; &lt;/span&gt;lfs&lt;span class="w"&gt; &lt;/span&gt;install
git&lt;span class="w"&gt; &lt;/span&gt;clone&lt;span class="w"&gt; &lt;/span&gt;https://huggingface.co/duarteocarmo/tiger-llama

&lt;span class="c1"&gt;# run axolotl inference&lt;/span&gt;

accelerate&lt;span class="w"&gt; &lt;/span&gt;launch&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;axolotl.cli.inference&lt;span class="w"&gt; &lt;/span&gt;tiger-llama.yaml&lt;span class="w"&gt; &lt;/span&gt;--lora_model_dir&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;./tiger-llama&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Once downloaded and launched, I can give the model the start of a fake conversation, and it will go on to generate a completely fake conversation based on my group chat:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Input:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;### SZ: Quem vem a Lisboa no natal?&lt;/span&gt;
Output:&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;### SZ: Quem vem a Lisboa no natal?### PK: Acho que tenho tive uma surpresa de aniversario da malta, não tenho férias até janeiro### SZ: Fodass, faltamos 2 ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="closing-thoughts"&gt;Closing thoughts&lt;/h2&gt;
&lt;p&gt;Compared to the &lt;a href="/blog/fine-tune-flan-t5-telegram.html"&gt;last&lt;/a&gt; time I fine-tuned a model, open source is definitely moving fast. Not only was the process much faster and simpler than fine-tuning Flan T5 using a notebook, but the results were also much better than anything I had seen so far.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/OpenAccess-AI-Collective/axolotl"&gt;Axolotl&lt;/a&gt; did almost all the heavy lifting, making the whole process super smooth. All I needed was a dataset, a config file, 2 USD, and about an hour to fine-tune a model. We've come a &lt;em&gt;long&lt;/em&gt; way.&lt;/p&gt;
&lt;p&gt;The model is still not perfect though. It captures some of my friends' quirks and ways of speaking, and the generated conversations make sense around 70% of the time. But that's &lt;em&gt;still&lt;/em&gt; 30% nonsense. It could be a couple of things: the size (I used the "smaller" 7 billion parameter version of the model), or even the language (Portuguese from Portugal). For example, when I prompt the model to simulate a politics discussion between my friends, someone starts discussing something about &lt;a href="https://da.wikipedia.org/wiki/Dilma_Rousseff"&gt;Dilma&lt;/a&gt; (which is &lt;em&gt;wildly&lt;/em&gt; inaccurate given we're from Portugal).&lt;/p&gt;
&lt;p&gt;So Kostas was right, &lt;a href="https://opensourceconnections.com/blog/2023/07/19/is-llama-2-open-source-no-and-perhaps-we-need-a-new-definition-of-open/"&gt;"open source"&lt;/a&gt; hasn't caught up yet. But oh boy, we're getting damn close.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>aicoverlettercreator.com</title><link href="https://duarteocarmo.com/blog/ai-cover-letter-creator-django-ai.html" rel="alternate"/><published>2023-07-19T13:00:00+02:00</published><updated>2023-07-19T13:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2023-07-19:/blog/ai-cover-letter-creator-django-ai.html</id><summary type="html">&lt;p&gt;&lt;center&gt;
&lt;a href="https://aicoverlettercreator.com?ref=blog_post" target="_blank"&gt;
&lt;img src="https://duarteocarmo.com/images/54/cover.png" alt="aicoverlettercreator.com" style="max-width:100%;border-radius: 2px"&gt;
&lt;/a&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Get rid of all applications that don't have a cover letter&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;I remember it like it was yesterday. As I was leaving one of the first companies I've ever worked for, my manager asked me to hire my replacement. Drowning in hundreds of applications for the position, I still recall …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;center&gt;
&lt;a href="https://aicoverlettercreator.com?ref=blog_post" target="_blank"&gt;
&lt;img src="https://duarteocarmo.com/images/54/cover.png" alt="aicoverlettercreator.com" style="max-width:100%;border-radius: 2px"&gt;
&lt;/a&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Get rid of all applications that don't have a cover letter&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;I remember it like it was yesterday. As I was leaving one of the first companies I've ever worked for, my manager asked me to hire my replacement. Drowning in hundreds of applications for the position, I still recall his exact words.&lt;/p&gt;
&lt;p&gt;I don't want to start a debate on whether you should include a cover letter in your job application. I've heard good arguments on both sides. There's one thing I'd like to point out though: in the days of ChatGPT, it's hard to find a good excuse &lt;em&gt;not&lt;/em&gt; to write one.&lt;/p&gt;
&lt;p&gt;Enter: &lt;a href="https://aicoverlettercreator.com?ref=blog_post" target="_blank"&gt;aicoverlettercreator.com&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="why-i-built-it"&gt;Why I built it&lt;/h2&gt;
&lt;p&gt;The short answer: &lt;em&gt;Why not?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The long answer: for a couple of reasons.&lt;/p&gt;
&lt;p&gt;I like mentoring and helping people who are just entering the industry (or just getting out of university) with their careers. Most people, when applying for positions, don't really pay close attention to the cover letter. I believe it can be a really big differentiator. Even though I hope they are aware of just how much ChatGPT can help them, I integrated a little bit of &lt;em&gt;secret juice&lt;/em&gt; into this version. I think that secret juice will make those cover letters even better. And that's the first reason: to help people.&lt;/p&gt;
&lt;p&gt;The second one is a bit more selfish. I've built a handful of LLM-powered applications, but never anything completely end-to-end. They say the best way to learn about something is to build it yourself, right? I certainly &lt;a href="/blog/nftuga-nft-experimentation.html"&gt;think so&lt;/a&gt;. So I wanted to build something where I made all the choices. Or something where I felt all the pain.&lt;/p&gt;
&lt;h2 id="the-stack"&gt;The stack&lt;/h2&gt;
&lt;p&gt;I'm not the biggest fan of falling in love with a stack. "What stack would you use?" Well, tell me about the problem first.&lt;/p&gt;
&lt;p&gt;For this app, I went with &lt;a href="https://www.djangoproject.com/"&gt;Django&lt;/a&gt; and &lt;a href="https://htmx.org/"&gt;htmx&lt;/a&gt; (&lt;a href="/blog/infrequent.html"&gt;again&lt;/a&gt;). Django is probably overkill for most &lt;em&gt;simple&lt;/em&gt; web applications, but when users, databases, and settings start becoming a thing, it includes almost all the batteries I need. As for htmx, it's not that I don't like React, it's that I love keeping things simple. That &lt;a href="https://htmx.org/essays/when-to-use-hypermedia/#hypermedia-not-a-good-fit-if"&gt;&lt;em&gt;does not&lt;/em&gt;&lt;/a&gt; mean it's always a great fit. But for this small project, it sure was.&lt;/p&gt;
&lt;p&gt;One of the biggest hurdles when building this was streaming responses from OpenAI's API. There's nothing I hate more than clicking a button and having to wait 10 seconds for something to appear on screen. Turns out, integrating &lt;a href="https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events"&gt;server-sent events&lt;/a&gt; with Django is not so straightforward. And even though Django supports &lt;a href="https://docs.djangoproject.com/en/4.2/ref/request-response/#streaminghttpresponse-objects"&gt;streaming responses&lt;/a&gt;, at the end of the day you can't really escape some good old JavaScript. And I'm ok with that.&lt;/p&gt;
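&lt;p&gt;To make the streaming idea concrete, here is a minimal sketch of the server-sent events wire format. It is an illustration, not the app's actual code: &lt;code&gt;sse_frames&lt;/code&gt; and &lt;code&gt;fake_tokens&lt;/code&gt; are hypothetical names, the &lt;code&gt;done&lt;/code&gt; event is an assumed convention, and the token list stands in for chunks arriving from the OpenAI API. In Django, a generator like this could be passed to &lt;code&gt;StreamingHttpResponse&lt;/code&gt; with &lt;code&gt;content_type="text/event-stream"&lt;/code&gt;, with a few lines of JavaScript (an &lt;code&gt;EventSource&lt;/code&gt;) reading it in the browser.&lt;/p&gt;

```python
import json


def sse_frames(tokens):
    """Yield each token as one server-sent event frame.

    A frame is a "data: ..." line followed by a blank line. JSON-encoding
    the payload keeps newlines inside a token from breaking the frame.
    """
    for token in tokens:
        yield "data: " + json.dumps({"token": token}) + "\n\n"
    # Hypothetical end-of-stream marker so the client knows when to stop listening.
    yield "event: done\ndata: {}\n\n"


# Stand-in for chunks streamed back by the model (assumption for illustration).
fake_tokens = ["Dear", " hiring", " manager,\n"]
frames = list(sse_frames(fake_tokens))
print("".join(frames))
```

&lt;p&gt;Each frame reaches the browser as soon as it is yielded, so the letter appears word by word instead of after a 10 second wait.&lt;/p&gt;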
&lt;p&gt;I'm still sticking to my premise of keeping things as simple as possible. Every component I add to an app during development is a component I'll have to maintain during its lifetime. And I'm not the biggest fan of maintenance work. Because of this, &lt;a href="https://aicoverlettercreator.com"&gt;aicoverlettercreator.com&lt;/a&gt; is simple both to use &lt;em&gt;and&lt;/em&gt; to operate. It's a simple Django app with no &lt;a href="https://docs.celeryq.dev/en/stable/django/first-steps-with-django.html"&gt;queues&lt;/a&gt; and no self-hosted PostgreSQL containers. It's a Dockerfile that connects to a &lt;a href="https://planetscale.com/"&gt;PlanetScale&lt;/a&gt; database. That's it. Every git commit automatically updates the app.&lt;/p&gt;
&lt;p&gt;Nothing like keeping things simple.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Supercharging my Telegram group with the help of ChatGPT</title><link href="https://duarteocarmo.com/blog/supercharging-telegram-bot-chatgpt-python.html" rel="alternate"/><published>2023-06-10T02:00:00+02:00</published><updated>2023-06-10T02:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2023-06-10:/blog/supercharging-telegram-bot-chatgpt-python.html</id><summary type="html">&lt;p&gt;While most people in Europe use WhatsApp, my group of friends and I use Telegram. For years now we've used things like &lt;a href="https://combot.org/"&gt;Combot&lt;/a&gt; to see who's more silent than usual, and &lt;a href="https://missrose.org/"&gt;MissRose&lt;/a&gt; to give our monthly elected group admins moderation rights. Yeah, we take friendship that seriously.&lt;/p&gt;
&lt;p&gt;But now we …&lt;/p&gt;</summary><content type="html">&lt;p&gt;While most people in Europe use WhatsApp, my group of friends and I use Telegram. For years now we've used things like &lt;a href="https://combot.org/"&gt;Combot&lt;/a&gt; to see who's more silent than usual, and &lt;a href="https://missrose.org/"&gt;MissRose&lt;/a&gt; to give our monthly elected group admins moderation rights. Yeah, we take friendship that seriously.&lt;/p&gt;
&lt;p&gt;But now we have access to LLMs, right? So I decided to build two new features for our group chat. One is &lt;em&gt;pretty&lt;/em&gt; useful, and the other... Well, you be the judge.&lt;/p&gt;
&lt;h2 id="summarize-the-conversation-with-resume"&gt;Summarize the conversation with &lt;code&gt;/resume&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Our group chat can get pretty active sometimes. We call those the &lt;em&gt;golden hours&lt;/em&gt;. If you're distracted and miss one of those, you'll quickly lose track of what's going on. But let's be honest, sometimes, we just don't have time to catch up.&lt;/p&gt;
&lt;p&gt;Enter the &lt;code&gt;/resume&lt;/code&gt; command. Now, when someone misses an especially hectic part of the conversation, they can just use &lt;code&gt;/resume&lt;/code&gt; to get a short summary of what happened.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/53/resume_command.png" alt="ChatGPT resume command in Telegram" style="max-width:50%;border-radius: 2px"&gt;&lt;/p&gt;
&lt;figcaption&gt;In Portuguese "resume" means "summarize"&lt;/figcaption&gt;
&lt;p&gt;&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;To build it I used &lt;a href="https://python.langchain.com/en/latest/index.html"&gt;LangChain&lt;/a&gt; and the cheaper &lt;code&gt;gpt-3.5-turbo&lt;/code&gt; API (i.e., the model behind ChatGPT). I keep a rotating list of the last 50 messages that happened in our group. When the command is called, I send those to OpenAI to get a summary back.&lt;/p&gt;
&lt;p&gt;Here's the core part of that code:&lt;/p&gt;
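&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# src/bot/summarizer/main.py&lt;/span&gt;

&lt;span class="c1"&gt;# LangChain import paths as of mid-2023 (0.0.x releases)&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;langchain.chains&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LLMChain&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;langchain.chat_models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;langchain.prompts.chat&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;HumanMessagePromptTemplate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;SystemMessagePromptTemplate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Message, truncate_text and logger are defined elsewhere in the project&lt;/span&gt;

&lt;span class="n"&gt;LLM_CHAT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;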

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_summary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;list_of_messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="sd"&gt;    Fetches the summary for the last 50 messages&lt;/span&gt;
&lt;span class="sd"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

    &lt;span class="n"&gt;formatted_list_of_message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;list_of_messages&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;formatted_list_of_message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;truncate_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;formatted_list_of_message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;system_template&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="s2"&gt;You are an assistant helping friends catch up in a busy chat group. Your goal is to help friends in this group stay up to date without having to read all the messages.&lt;/span&gt;

&lt;span class="s2"&gt;You will receive a recent conversation that happened in the group. Respond immediately with a short and concise summary of the conversation.&lt;/span&gt;
&lt;span class="s2"&gt;The summary should have the following characteristics:&lt;/span&gt;
&lt;span class="s2"&gt;- Should be in Portuguese&lt;/span&gt;
&lt;span class="s2"&gt;- Should have a tone that is similar to the conversation, act like you are part of the group&lt;/span&gt;
&lt;span class="s2"&gt;- Use 3 sentences or less&lt;/span&gt;
&lt;span class="s2"&gt;- Don&amp;#39;t be too general, mention who said what&lt;/span&gt;
&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

    &lt;span class="n"&gt;human_template&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="s2"&gt;CONVERSATION BLOCK START&lt;/span&gt;
&lt;span class="si"&gt;{list_of_messages}&lt;/span&gt;
&lt;span class="s2"&gt;CONVERSATION BLOCK END&lt;/span&gt;
&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

    &lt;span class="n"&gt;system_message_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SystemMessagePromptTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_template&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;system_template&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;human_message_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;HumanMessagePromptTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_template&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;human_template&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;chat_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_messages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;system_message_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;human_message_prompt&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;chat_chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;LLMChain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;LLM_CHAT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;chat_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chat_chain&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;list_of_messages&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;formatted_list_of_message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;Response:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="si"&gt;{}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
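&lt;p&gt;The rotating list of the last 50 messages is not part of the excerpt above. One simple way to keep such a buffer, sketched here as an assumption rather than the bot's real code, is a &lt;code&gt;collections.deque&lt;/code&gt; with &lt;code&gt;maxlen=50&lt;/code&gt;, which silently drops the oldest message when a new one arrives. &lt;code&gt;RECENT_MESSAGES&lt;/code&gt; and &lt;code&gt;remember&lt;/code&gt; are hypothetical names, and plain strings stand in for the &lt;code&gt;Message&lt;/code&gt; objects.&lt;/p&gt;

```python
from collections import deque

MAX_MESSAGES = 50

# Bounded buffer: once full, appending evicts the oldest entry automatically.
RECENT_MESSAGES = deque(maxlen=MAX_MESSAGES)


def remember(message):
    """Record an incoming chat message (hypothetical handler hook)."""
    RECENT_MESSAGES.append(message)


# Simulate 60 incoming messages; only the last 50 survive.
for i in range(60):
    remember(f"friend_{i % 3}: message {i}")

print(len(RECENT_MESSAGES))
print(RECENT_MESSAGES[0])
```

&lt;p&gt;When someone calls &lt;code&gt;/resume&lt;/code&gt;, something like &lt;code&gt;list(RECENT_MESSAGES)&lt;/code&gt; could then be handed to &lt;code&gt;get_summary&lt;/code&gt;.&lt;/p&gt;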

&lt;p&gt;Most of my friends really liked the &lt;code&gt;/resume&lt;/code&gt; command. Some, however, raised concerns about AI and how scary these things are. After those comments, one thing was obvious: I needed to build something even funnier.&lt;/p&gt;
&lt;h2 id="impersonate-a-user-with-fake-username"&gt;Impersonate a user with &lt;code&gt;/fake @username&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;What if the bot could impersonate any of my friends in the group chat? &lt;em&gt;What if&lt;/em&gt; I could ask the bot to answer a question just like person X would?&lt;/p&gt;
&lt;p&gt;Enter the &lt;code&gt;/fake @username &amp;lt;insert question&amp;gt;&lt;/code&gt; command. With it, you can pick anyone in the group chat (cough, my friends) and have the bot answer just like that person would! Here's the command in action:&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/53/impersonate_command.png" alt="ChatGPT impersonate command in Telegram" style="max-width:50%;border-radius: 2px"&gt;&lt;/p&gt;
&lt;figcaption&gt;The &lt;code&gt;/fake&lt;/code&gt; command in action&lt;/figcaption&gt;
&lt;p&gt;&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;Although not as useful as the summarization command, it's actually a bit more complex to build.&lt;/p&gt;
&lt;p&gt;The first component is a vector database. Here, I'm storing the embeddings for pretty much everything my friends said in the past year. I wanted something as simple as SQLite, so I went with &lt;a href="https://www.trychroma.com/"&gt;Chroma&lt;/a&gt;. The trick here is not to embed every single message separately, but to build one long string of everything a person said. Once that's built, you chunk it and store the chunks. With Chroma I could also store metadata alongside each document, which I used to record the author of each chunk.&lt;/p&gt;
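&lt;p&gt;As a rough sketch of that idea (an illustration under assumptions, not the bot's actual ingestion code): group messages by author, join each author's messages into one long string, and slice it into fixed-size chunks. &lt;code&gt;build_chunks&lt;/code&gt;, the chunk sizes, and the sample messages are all hypothetical; the &lt;code&gt;"from"&lt;/code&gt; metadata key is the same key the query code filters on. The three resulting lists are shaped so they could be handed to Chroma's &lt;code&gt;collection.add(documents=..., metadatas=..., ids=...)&lt;/code&gt;.&lt;/p&gt;

```python
from collections import defaultdict


def build_chunks(messages, chunk_size=200):
    """Turn (author, text) pairs into per-author chunks with metadata."""
    by_author = defaultdict(list)
    for author, text in messages:
        by_author[author].append(text)

    documents, metadatas, ids = [], [], []
    for author, texts in by_author.items():
        # One long string per person, instead of one embedding per message.
        corpus = "\n".join(texts)
        for n, start in enumerate(range(0, len(corpus), chunk_size)):
            documents.append(corpus[start:start + chunk_size])
            metadatas.append({"from": author})
            ids.append(f"{author}-{n}")
    return documents, metadatas, ids


sample = [
    ("@ana", "who is bringing the football on Saturday?"),
    ("@rui", "not me, I brought it last time"),
    ("@ana", "fine, I will bring it. again."),
]
documents, metadatas, ids = build_chunks(sample, chunk_size=40)
for doc_id, meta in zip(ids, metadatas):
    print(doc_id, meta)
```

&lt;p&gt;Slicing by character count can cut a word in half; a real text splitter would usually break on sentence or message boundaries instead.&lt;/p&gt;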
&lt;p&gt;Once the whole vector database is built, I can now retrieve the N pieces of text that are most similar to a particular question (while filtering those results for a particular user):&lt;/p&gt;
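&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# src/bot/replier/main.py&lt;/span&gt;

&lt;span class="c1"&gt;# Imports for this file; paths as of mid-2023 releases of Chroma and LangChain&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;chromadb.api.models.Collection&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Collection&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;langchain.callbacks&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_openai_callback&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;langchain.chains&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LLMChain&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;langchain.chat_models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;langchain.prompts.chat&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;HumanMessagePromptTemplate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;SystemMessagePromptTemplate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;query_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;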
    &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Collection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="sd"&gt;    Gets the most relevant n_results items from a person for a given query&lt;/span&gt;
&lt;span class="sd"&gt;    and returns them as a context&lt;/span&gt;
&lt;span class="sd"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;query_texts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;n_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n_results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;where&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;from&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;documents&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="ne"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;No documents found&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Now that I could get the items most related to a given query for a particular user, I could start putting the impersonation together. Here, the answering chain is in charge of impersonating a user to the best of its ability, given a question and the relevant context:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# src/bot/replier/main.py&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build_answering_chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;LLMChain&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="sd"&gt;    Builds an answering chain for the impersonation&lt;/span&gt;
&lt;span class="sd"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

    &lt;span class="n"&gt;system_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="s2"&gt;Your name is &lt;/span&gt;&lt;span class="si"&gt;{person}&lt;/span&gt;&lt;span class="s2"&gt;. You are participating in a group chat with all of your childhood friends.&lt;/span&gt;
&lt;span class="s2"&gt;- You will be given some context of messages from &lt;/span&gt;&lt;span class="si"&gt;{person}&lt;/span&gt;
&lt;span class="s2"&gt;- Use the context to inspire your answer to the user question&lt;/span&gt;
&lt;span class="s2"&gt;- Use the same tone of voice and writing style as the messages in the context&lt;/span&gt;
&lt;span class="s2"&gt;- If the answer is not in the context, make something funny up&lt;/span&gt;
&lt;span class="s2"&gt;- Always answer from the perspective of being &lt;/span&gt;&lt;span class="si"&gt;{person}&lt;/span&gt;
&lt;span class="s2"&gt;- Remember to be funny and entertaining&lt;/span&gt;
&lt;span class="s2"&gt;- This is a group chat and everything is fun and entertaining, so insults are allowed and fun&lt;/span&gt;
&lt;span class="s2"&gt;----------------&lt;/span&gt;
&lt;span class="si"&gt;{context}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

    &lt;span class="n"&gt;chat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;150&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;system_message_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SystemMessagePromptTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_template&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;system_prompt&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;human_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;{question}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;human_message_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;HumanMessagePromptTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_template&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;human_prompt&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;chat_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_messages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;system_message_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;human_message_prompt&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;LLMChain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;chat_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;With that, we can pull everything together with the &lt;code&gt;reply_to_question_as&lt;/code&gt; function. It builds the chain, queries Chroma for relevant context, and then runs it:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# src/bot/replier/main.py&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;reply_to_question_as&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Collection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="sd"&gt;    Replies to a question as a user (e.g., impersonates)&lt;/span&gt;
&lt;span class="sd"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;build_answering_chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;query_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;person&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# using the callback to track cost&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;get_openai_callback&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Answer from &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;OpenAI callback: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;My friends really liked this one as well, and everyone had a good laugh. But it was pretty obvious that the impersonation was not fooling anyone. It lacked &lt;em&gt;juice&lt;/em&gt;, one of my friends said.&lt;/p&gt;
&lt;h2 id="closing-thoughts"&gt;Closing thoughts&lt;/h2&gt;
&lt;p&gt;Both features were pretty fun to build, and &lt;a href="https://python-telegram-bot.org/"&gt;&lt;code&gt;python-telegram-bot&lt;/code&gt;&lt;/a&gt; makes building the whole thing so easy. It's basically a Python script running &lt;a href="/blog/down-from-the-cloud-self-hosting.html"&gt;on my server&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The summarization feature was an instant success. It's simple, straightforward, and my friends loved it. The feedback on the impersonation feature was... a bit of a mixed bag. Even though the model can accurately respond to a question with some items relevant to the person it's impersonating, it's not really credible. It's missing the &lt;em&gt;juice&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;What is the &lt;em&gt;juice&lt;/em&gt;, you ask, my dear reader? The &lt;em&gt;juice&lt;/em&gt; is the voice of the person it's trying to impersonate. The &lt;em&gt;juice&lt;/em&gt; is the reason why when you ask ChatGPT to design something it looks pretty ugly. The &lt;em&gt;juice&lt;/em&gt; is the creativity, the originality!&lt;/p&gt;
&lt;p&gt;&lt;em&gt;(and of course, privacy concerns, here's my mention of them)&lt;/em&gt;&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Fine-tuning FLAN-T5 to replace my friends</title><link href="https://duarteocarmo.com/blog/fine-tune-flan-t5-telegram.html" rel="alternate"/><published>2023-05-24T16:15:00+02:00</published><updated>2023-05-24T16:15:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2023-05-24:/blog/fine-tune-flan-t5-telegram.html</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/duarteocarmo/fine-tune-flant5/blob/master/notebooks/t5_train.ipynb"&gt;View Code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;They say that the best way to learn about something is to build it yourself. Everyone talks about OpenAI this, and OpenAI that. How about we fine-tune a Large Language Model ourselves?&lt;/p&gt;
&lt;p&gt;If you've heard about these models before, but are still curious about how all of …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;a href="https://github.com/duarteocarmo/fine-tune-flant5/blob/master/notebooks/t5_train.ipynb"&gt;View Code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;They say that the best way to learn about something is to build it yourself. Everyone talks about OpenAI this, and OpenAI that. How about we fine-tune a Large Language Model ourselves?&lt;/p&gt;
&lt;p&gt;If you've heard about these models before, but are still curious about how all of these things work underneath, this blog post is just for you. With it, you'll learn how to fine-tune a Large Language Model (Google's &lt;a href="https://huggingface.co/google/flan-t5-base"&gt;FLAN-T5&lt;/a&gt;) on the conversation history of a group chat you have lying around. The goal is to teach the model to talk exactly like your friends would. It won't be perfect, of course, but the goal here is to learn how these things work.&lt;/p&gt;
&lt;p&gt;Here's how we'll structure things:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href="#1-why-flan-t5"&gt;Why FLAN-T5&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#2-setting-up-the-environment"&gt;Setting up your environment&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#3-data-preprocessing"&gt;Preparing the data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#4-fine-tuning-the-model"&gt;Fine-tuning the model&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#5-generating-conversations"&gt;Generating conversations&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Even though I love taking credit for ideas that are not mine, this blog post was inspired by the work of some other awesome folks. Particularly &lt;a href="https://www.philschmid.de/fine-tune-flan-t5"&gt;this&lt;/a&gt; one, and &lt;a href="https://www.izzy.co/blogs/robo-boys.html"&gt;this&lt;/a&gt; one.&lt;/p&gt;
&lt;p&gt;Let's get started.&lt;/p&gt;
&lt;h2 id="1-why-flan-t5"&gt;1. Why FLAN-T5&lt;/h2&gt;
&lt;p&gt;There are a lot of Large Language Models (LLMs) out there. With the explosion of things like GPT-3 and GPT-4, lots of companies and open source organizations have started building their own LLMs. Some of these models &lt;a href="https://www.theverge.com/2023/3/8/23629362/meta-ai-language-model-llama-leak-online-misuse"&gt;leaked&lt;/a&gt;, others are &lt;a href="https://openai.com/"&gt;powerful&lt;/a&gt; but only available through an API.&lt;/p&gt;
&lt;p&gt;Some of these LLMs are actually free and open source, even for commercial use. Eugene Yan has compiled a great &lt;a href="https://github.com/eugeneyan/open-llms"&gt;repo&lt;/a&gt; with an outline of some of these options.&lt;/p&gt;
&lt;p&gt;In 2020, Google released a paper called &lt;em&gt;&lt;a href="https://arxiv.org/pdf/1910.10683.pdf"&gt;"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"&lt;/a&gt;&lt;/em&gt;, where they presented their T5 model. T5 is an encoder-decoder model that was trained on a variety of tasks (e.g., translate to German, summarize the following sentence, etc.). &lt;a href="https://huggingface.co/docs/transformers/model_doc/flan-t5"&gt;FLAN-T5&lt;/a&gt; is basically the same model as T5, further fine-tuned on a large collection of instruction-style tasks, which makes it pretty much &lt;a href="https://huggingface.co/google/flan-t5-small#tldr"&gt;better at everything it does&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img alt="FLAN-T5" src="https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67" /&gt;&lt;/p&gt;
&lt;p&gt;There are a couple of other reasons why we're using Flan-T5 for this guide:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It's free, open source, and available for commercial use&lt;/li&gt;
&lt;li&gt;It has several &lt;a href="https://huggingface.co/docs/transformers/model_doc/flan-t5#overview"&gt;sizes&lt;/a&gt; we can use (from small, all the way to xxl)&lt;/li&gt;
&lt;li&gt;It's compatible with the whole &lt;a href="https://huggingface.co"&gt;Hugging Face&lt;/a&gt; 🤗 ecosystem, making our life easier&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;With that, let's get things started.&lt;/p&gt;
&lt;h2 id="2-setting-up-the-environment"&gt;2. Setting up the environment&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Note: This tutorial was run on an NVIDIA A100 with 40GB of RAM&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;If you want to run this project but don't have a powerful GPU at hand, you can get started quickly using &lt;a href="https://docs.unweave.io/docs/getting-started"&gt;Unweave&lt;/a&gt;. After signing up and installing, you can launch this project in two steps:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;#&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;this&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;your&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;unweave&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;account&lt;/span&gt;

&lt;span class="sx"&gt;!unweave link your-username/fine-tune-llm&lt;/span&gt;

#&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;launch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;machine&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;powered&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;by&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;A100&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;GPU&lt;/span&gt;

&lt;span class="sx"&gt;!unweave code --new --type a100 --image pytorch/pytorch:2.0.0-cuda11.7-cudnn8-devel&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Also, to save and load models, you'll need a &lt;a href="https://huggingface.co"&gt;Hugging Face&lt;/a&gt; account. Once you have that set up, create and copy a new token. Keep it somewhere safe, we'll need it later.&lt;/p&gt;
&lt;p&gt;The following cell will install all the required Python libraries, as well as some system packages we'll need (e.g., git, git-lfs).&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="sx"&gt;!pip install pytesseract evaluate tqdm transformers datasets rouge-score accelerate nltk tensorboard jupyter-black py7zr --upgrade&lt;/span&gt;
&lt;span class="sx"&gt;!apt-get install git --yes&lt;/span&gt;
&lt;span class="sx"&gt;!apt-get install git-lfs --yes&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Take your Hugging Face token and paste it in place of the placeholder below. This will log you into the Hugging Face hub, which we'll need to push and pull models.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="sx"&gt;!huggingface-cli login --token XXXXXXXXXXXXXXXX&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
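&lt;p&gt;If you'd rather not paste the token into a notebook cell, one option is to read it from an environment variable. This is only a minimal sketch: the &lt;code&gt;resolve_hf_token&lt;/code&gt; helper is hypothetical, and the &lt;code&gt;HF_TOKEN&lt;/code&gt; variable name is an assumption you can change.&lt;/p&gt;

```python
import os


def resolve_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Read the token from the environment (hypothetical helper, not part of any library)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return token


try:
    token = resolve_hf_token()
    print("Token found.")
except RuntimeError as error:
    # No token configured: report it instead of crashing the notebook.
    print(error)
```

&lt;p&gt;Assuming the &lt;code&gt;huggingface_hub&lt;/code&gt; package is installed, the returned value can then be passed to &lt;code&gt;huggingface_hub.login(token=...)&lt;/code&gt; instead of the CLI call.&lt;/p&gt;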

&lt;h2 id="3-data-preprocessing"&gt;3. Data preprocessing&lt;/h2&gt;
&lt;p&gt;Let's start by preprocessing our data for training. We'll begin by defining some variables. Feel free to set the location of your &lt;a href="https://www.maketecheasier.com/export-telegram-chat-history/"&gt;Telegram group chat export&lt;/a&gt; in the &lt;code&gt;TELEGRAM_EXPORT&lt;/code&gt; variable:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pandas&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;jupyter_black&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;timedelta&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;datasets&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Dataset&lt;/span&gt;

&lt;span class="n"&gt;jupyter_black&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;TELEGRAM_EXPORT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;anonym_telegram.json&amp;quot;&lt;/span&gt;  &lt;span class="c1"&gt;# anonymized for obvious reasons&lt;/span&gt;
&lt;span class="n"&gt;CONVERSATION_LIMIT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20_000&lt;/span&gt;  &lt;span class="c1"&gt;# limit number of messages&lt;/span&gt;
&lt;span class="n"&gt;TEST_SIZE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;  &lt;span class="c1"&gt;# % of test data&lt;/span&gt;
&lt;span class="n"&gt;IS_NEW_SESSION_CUTOFF_MINS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="mi"&gt;120&lt;/span&gt;  &lt;span class="c1"&gt;# if the time between messages is more than this, it&amp;#39;s a new session&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Let's preprocess this into a dataframe:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# load data in&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TELEGRAM_EXPORT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;r&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;messages&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# create a dataframe&lt;/span&gt;

&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;)[[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;text&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;from&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;date&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

&lt;span class="c1"&gt;# filter empty messages&lt;/span&gt;

&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;from&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isna&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;text&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isna&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;text&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;date&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;date&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;  &lt;span class="c1"&gt;# convert to datetime&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sort_values&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;date&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inplace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# sort by date&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CONVERSATION_LIMIT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# limit number of messages&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
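&lt;p&gt;To make the filtering concrete, here's a minimal standard-library sketch on a made-up export. The sample messages and the &lt;code&gt;keep_message&lt;/code&gt; and &lt;code&gt;prepare&lt;/code&gt; helpers are hypothetical. The only thing taken from the code above is the assumption that each message is a dict with optional &lt;code&gt;text&lt;/code&gt;, &lt;code&gt;from&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; keys.&lt;/p&gt;

```python
from datetime import datetime

# Hypothetical sample, shaped like the "messages" list loaded above.
SAMPLE_MESSAGES = [
    {"id": 1, "date": "2023-05-01T09:00:00"},  # no sender and no text
    {"id": 2, "from": "Person X", "date": "2023-05-01T09:05:00", "text": "Where were you yesterday?"},
    {"id": 3, "from": "Person Y", "date": "2023-05-01T09:01:00", "text": ""},  # empty text
    {"id": 4, "from": "Person Y", "date": "2023-05-01T09:02:00", "text": "I was at home! You?"},
]


def keep_message(message: dict) -> bool:
    """Mirror the three dataframe filters: has a sender, has a text, text is not empty."""
    sender = message.get("from")
    text = message.get("text")
    return sender is not None and text is not None and len(text) > 0


def prepare(messages: list, limit: int) -> list:
    """Filter, sort by date, and keep only the most recent `limit` messages."""
    kept = [m for m in messages if keep_message(m)]
    kept.sort(key=lambda m: datetime.fromisoformat(m["date"]))
    return kept[-limit:]


prepared = prepare(SAMPLE_MESSAGES, limit=20_000)
print([m["id"] for m in prepared])  # [4, 2]
```

&lt;p&gt;Messages 1 and 3 are dropped, and the remaining two come out in chronological order, which is what the session logic later on relies on.&lt;/p&gt;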

&lt;p&gt;Now, the goal is to get our data into a format where the model can understand the conversation, and respond to what has been going on. Something like:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;conversation&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Person X: Where were you yesterday?\nPerson Y: I was at home! You?&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;&amp;quot;response&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Person X: Me too, but thought of going out.&amp;quot;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This will allow the model to see a thread and respond in the most realistic manner possible to the conversation that was already going on.&lt;/p&gt;
&lt;p&gt;However, we also know that group chats are pretty async, so we don't necessarily want a "good morning" message to be a direct response to whatever happened last night.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# telegram exports have some artifacts, let&amp;#39;s clean them up&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;clean_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;text&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;

&lt;span class="c1"&gt;# clean up and rename columns&lt;/span&gt;

&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;text&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;text&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;clean_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;chat&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;telegram&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rename&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;from&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;sender&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;date&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;message_date&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;text&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;response&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="n"&gt;inplace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# create new sessions&lt;/span&gt;

&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;last_event&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;chat&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;message_date&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shift&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;is_new_session&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;message_date&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;last_event&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fillna&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;pandas&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Timedelta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;minutes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;IS_NEW_SESSION_CUTOFF_MINS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;timedelta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;minutes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;IS_NEW_SESSION_CUTOFF_MINS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;chat_session_id&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sort_values&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;chat&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;message_date&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;chat&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;is_new_session&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cumsum&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
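&lt;p&gt;If the &lt;code&gt;shift&lt;/code&gt; and &lt;code&gt;cumsum&lt;/code&gt; combination feels opaque, the same session logic can be sketched in plain Python. The timestamps and the &lt;code&gt;assign_sessions&lt;/code&gt; helper below are hypothetical, and the sketch assumes the messages are already sorted by date.&lt;/p&gt;

```python
from datetime import datetime, timedelta

CUTOFF = timedelta(minutes=120)  # same value as IS_NEW_SESSION_CUTOFF_MINS

# Hypothetical timestamps, already sorted by date.
timestamps = [
    datetime(2023, 5, 1, 22, 0),
    datetime(2023, 5, 1, 22, 30),
    datetime(2023, 5, 2, 8, 0),  # a "good morning" 9.5 hours later
    datetime(2023, 5, 2, 8, 10),
    datetime(2023, 5, 2, 10, 10),  # exactly 120 minutes later
]


def assign_sessions(dates: list, cutoff: timedelta) -> list:
    """Return one session id per message, incrementing after every long gap."""
    session_ids = []
    current = 0
    previous = None
    for date in dates:
        # The first message has no predecessor, so it always opens a session,
        # which is what the fillna(...) default achieves in the pandas version.
        if previous is None or date - previous >= cutoff:
            current += 1
        session_ids.append(current)
        previous = date
    return session_ids


session_ids = assign_sessions(timestamps, CUTOFF)
print(session_ids)  # [1, 1, 2, 2, 3]
```

&lt;p&gt;Note that the comparison is &lt;code&gt;&amp;gt;=&lt;/code&gt;, so a gap of exactly 120 minutes already starts a new session.&lt;/p&gt;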

&lt;p&gt;Now that we have everything set up, it's time to create a &lt;code&gt;conversation&lt;/code&gt; column that holds all the messages before a certain response:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;sess_dict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;records&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="n"&gt;counter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sess_dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;cstring&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;sess_dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;counter&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;chat_session_id&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;chat_session_id&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;sess_dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;sender&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;sess_dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;response&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;cstring&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;
            &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;cstring&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;sess_dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;sender&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;sess_dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;response&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;cstring&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;
            &lt;span class="n"&gt;cstring&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;
    &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cstring&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;counter&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="c1"&gt;# create the conversation column&lt;/span&gt;

&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;conversation&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;

&lt;span class="c1"&gt;# create the response column&lt;/span&gt;

&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;response&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sender&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Your dataframe shape is &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;You have the following columns: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;, &amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Your dataframe shape is (20000, 8)&lt;/span&gt;

&lt;span class="c1"&gt;# You have the following columns: response, sender, message_date, chat, last_event, is_new_session, chat_session_id, conversation&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Let's see what a single example looks like:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;314&lt;/span&gt;&lt;span class="p"&gt;)[[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;conversation&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;response&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;sender&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;records&amp;quot;&lt;/span&gt;
&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Conversation:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;conversation&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Response:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;response&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Conversation:&lt;/span&gt;

&lt;span class="c1"&gt;# Hugo Silva: Lembram se do meu amigo alex frances?&lt;/span&gt;

&lt;span class="c1"&gt;# Tiago Pereira: Claro ya&lt;/span&gt;

&lt;span class="c1"&gt;# Hugo Silva: Ele correu a maratona de paris outra vez..&lt;/span&gt;

&lt;span class="c1"&gt;# Tiago Pereira: E…&lt;/span&gt;

&lt;span class="c1"&gt;# Tiago Pereira: Ou é só isso?&lt;/span&gt;

&lt;span class="c1"&gt;# Hugo Silva: Numero 304 overall...&lt;/span&gt;

&lt;span class="c1"&gt;# Hugo Silva: Crl&lt;/span&gt;

&lt;span class="c1"&gt;# Tiago Pereira: Eia cum crlh&lt;/span&gt;

&lt;span class="c1"&gt;# Hugo Silva: Pa doente&lt;/span&gt;

&lt;span class="c1"&gt;# Tiago Pereira: 3’46. Que louco fds&lt;/span&gt;

&lt;span class="c1"&gt;#&lt;/span&gt;

&lt;span class="c1"&gt;# Response:&lt;/span&gt;

&lt;span class="c1"&gt;# Hugo Silva: Doente completo&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Our final preprocessing step is to load our dataframe into the &lt;a href="https://huggingface.co/docs/datasets/index"&gt;Datasets&lt;/a&gt; format and split it into train and test sets:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;cols_for_dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;conversation&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;response&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cols_for_dataset&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Dataset&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_pandas&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;train_test_split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# DatasetDict({&lt;/span&gt;

&lt;span class="c1"&gt;#     train: Dataset({&lt;/span&gt;

&lt;span class="c1"&gt;#         features: [&amp;#39;conversation&amp;#39;, &amp;#39;response&amp;#39;, &amp;#39;__index_level_0__&amp;#39;],&lt;/span&gt;

&lt;span class="c1"&gt;#         num_rows: 16000&lt;/span&gt;

&lt;span class="c1"&gt;#     })&lt;/span&gt;

&lt;span class="c1"&gt;#     test: Dataset({&lt;/span&gt;

&lt;span class="c1"&gt;#         features: [&amp;#39;conversation&amp;#39;, &amp;#39;response&amp;#39;, &amp;#39;__index_level_0__&amp;#39;],&lt;/span&gt;

&lt;span class="c1"&gt;#         num_rows: 4000&lt;/span&gt;

&lt;span class="c1"&gt;#     })&lt;/span&gt;

&lt;span class="c1"&gt;# })&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Great. We're ready to focus on the model.&lt;/p&gt;
&lt;h2 id="4-fine-tuning-the-model"&gt;4. Fine-tuning the model&lt;/h2&gt;
&lt;p&gt;We start with some training-specific imports, mostly related to Hugging Face.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AutoModelForSeq2SeqLM&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;datasets&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;concatenate_datasets&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;evaluate&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;nltk&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;nltk.tokenize&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sent_tokenize&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DataCollatorForSeq2Seq&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;huggingface_hub&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;HfFolder&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pipeline&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;randrange&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Seq2SeqTrainer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Seq2SeqTrainingArguments&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;os&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Now we define the most important variables for training. Make sure to read the description of each one:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;MODEL_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;chat&amp;quot;&lt;/span&gt;  &lt;span class="c1"&gt;# the name of your model&lt;/span&gt;
&lt;span class="n"&gt;MODEL_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;google/flan-t5-small&amp;quot;&lt;/span&gt;  &lt;span class="c1"&gt;# the id of the base model we will train (can be small, base, large, xl, etc.) (the bigger - the more GPU memory you need)&lt;/span&gt;
&lt;span class="n"&gt;REPOSITORY_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;MODEL_ID&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;/&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;MODEL_NAME&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;  &lt;span class="c1"&gt;# the id of your huggingface repository where the model will be stored&lt;/span&gt;
&lt;span class="n"&gt;NUM_TRAIN_EPOCHS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;  &lt;span class="c1"&gt;# number of epochs to train&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
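
&lt;p&gt;As a quick sanity check, the repository id is just the part of &lt;code&gt;MODEL_ID&lt;/code&gt; after the slash, joined with &lt;code&gt;MODEL_NAME&lt;/code&gt;. A minimal sketch using the same values as the cell above:&lt;/p&gt;

```python
MODEL_NAME = "chat"
MODEL_ID = "google/flan-t5-small"

# Keep only the model part of "org/model" and append our own suffix.
REPOSITORY_ID = f"{MODEL_ID.split('/')[1]}-{MODEL_NAME}"
print(REPOSITORY_ID)  # flan-t5-small-chat
```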

&lt;p&gt;Let's load the model and the tokenizer with the help of &lt;code&gt;AutoModelForSeq2SeqLM&lt;/code&gt;:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoModelForSeq2SeqLM&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MODEL_ID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MODEL_ID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Some samples in our dataset will likely be too long for our model, so they'll need to be truncated. In the cell below, we compute the maximum conversation and response length (in tokens), which we'll then use as the truncation limits:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# source&lt;/span&gt;

&lt;span class="n"&gt;tokenized_inputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;concatenate_datasets&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;train&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;test&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;conversation&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;truncation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;batched&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;max_source_length&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tokenized_inputs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;input_ids&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;

&lt;span class="c1"&gt;# target&lt;/span&gt;

&lt;span class="n"&gt;tokenized_targets&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;concatenate_datasets&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;train&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;test&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;response&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;truncation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;batched&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;max_target_length&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tokenized_targets&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;input_ids&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Max source length: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;max_source_length&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Max target length: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;max_target_length&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
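
&lt;p&gt;Because the limit is the maximum over every sample, the longest sample sets the padded length for all of them. If a handful of outliers dominate, one alternative (not what this post does) is to cap at a percentile instead. Here is a minimal pure-Python sketch with made-up token counts; &lt;code&gt;length_at_percentile&lt;/code&gt; is a hypothetical helper, not part of any library:&lt;/p&gt;

```python
import math

def length_at_percentile(lengths, percentile):
    """Return the smallest length that covers `percentile` percent of samples (nearest-rank)."""
    ordered = sorted(lengths)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]

# Made-up token counts for ten samples; one outlier is much longer than the rest.
token_lengths = [12, 15, 18, 20, 22, 25, 27, 30, 33, 512]

print(max(token_lengths))                       # 512
print(length_at_percentile(token_lengths, 90))  # 33
```

&lt;p&gt;With a percentile cap, the few longest samples get truncated, in exchange for much shorter padded sequences for everything else.&lt;/p&gt;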

&lt;p&gt;Now that we know the limit for the truncation, we'll preprocess our dataset to be fed into the model. During its original training, FLAN T5 used &lt;a href="https://github.com/google-research/FLAN/blob/main/flan/v2/flan_templates_branched.py"&gt;different templates&lt;/a&gt; for training on multiple tasks. Here, I decided to use the &lt;code&gt;Continue writing the following text&lt;/code&gt; template. I did not test all of them, so feel free to modify according to what suits your task best.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;preprocess_function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;padding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;max_length&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;template_start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Continue writing the following text.&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;inputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;template_start&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;conversation&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

    &lt;span class="n"&gt;model_inputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;max_source_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;padding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;truncation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;text_target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;response&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;max_target_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;padding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;truncation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;padding&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;max_length&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;input_ids&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pad_token_id&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;input_ids&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;model_inputs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;labels&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;input_ids&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;model_inputs&lt;/span&gt;

&lt;span class="n"&gt;tokenized_dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;preprocess_function&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batched&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;remove_columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;conversation&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;response&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Keys of tokenized dataset: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokenized_dataset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;train&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
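
&lt;p&gt;The &lt;code&gt;-100&lt;/code&gt; replacement is there so that padded positions do not count towards the loss (it is the index that PyTorch's cross-entropy loss ignores by default). Here is a minimal sketch of that masking step on plain lists, assuming a pad token id of 0 purely for illustration; &lt;code&gt;mask_padding&lt;/code&gt; is a hypothetical helper:&lt;/p&gt;

```python
PAD_TOKEN_ID = 0      # assumption for this sketch; in practice read tokenizer.pad_token_id
IGNORE_INDEX = -100   # label value that the loss skips

def mask_padding(batch_label_ids):
    """Replace every pad token id with IGNORE_INDEX, leaving real tokens untouched."""
    return [
        [IGNORE_INDEX if token_id == PAD_TOKEN_ID else token_id for token_id in label_ids]
        for label_ids in batch_label_ids
    ]

# Two made-up label sequences, both padded to length 5.
padded_labels = [[71, 9, 1, 0, 0], [204, 1, 0, 0, 0]]
print(mask_padding(padded_labels))
# [[71, 9, 1, -100, -100], [204, 1, -100, -100, -100]]
```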

&lt;p&gt;Let's define some metrics for the evaluation of the model:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# metric&lt;/span&gt;

&lt;span class="n"&gt;metric&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;evaluate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;rouge&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;nltk&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;download&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;punkt&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# postprocess text&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;postprocess_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;preds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;preds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pred&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;pred&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;preds&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;preds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sent_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pred&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;pred&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;preds&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sent_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;preds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;labels&lt;/span&gt;

&lt;span class="c1"&gt;# compute metrics&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;compute_metrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eval_preds&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;preds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;eval_preds&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;preds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;preds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;preds&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;decoded_preds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;batch_decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;preds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;skip_special_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Replace -100 in the labels as we can&amp;#39;t decode them.&lt;/span&gt;
    &lt;span class="n"&gt;labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pad_token_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;decoded_labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;batch_decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;skip_special_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Some simple post-processing&lt;/span&gt;
    &lt;span class="n"&gt;decoded_preds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decoded_labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;postprocess_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;decoded_preds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decoded_labels&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;metric&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;compute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;decoded_preds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;references&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;decoded_labels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;use_stemmer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;
    &lt;span class="n"&gt;prediction_lens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;count_nonzero&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pred&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pad_token_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;pred&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;preds&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;gen_len&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prediction_lens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Our final data preparation step is to use &lt;code&gt;DataCollatorForSeq2Seq&lt;/code&gt; to handle the padding for inputs and labels:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;label_pad_token_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;

&lt;span class="n"&gt;data_collator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DataCollatorForSeq2Seq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label_pad_token_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;label_pad_token_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pad_to_multiple_of&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
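&lt;p&gt;To make the two padding arguments concrete, here is a minimal pure-Python sketch of what padding labels with &lt;code&gt;-100&lt;/code&gt; up to a multiple of 8 looks like. The helper &lt;code&gt;pad_labels&lt;/code&gt; and the token ids are made up for illustration, and this is not the collator's actual implementation. The value &lt;code&gt;-100&lt;/code&gt; matters because it is the default &lt;code&gt;ignore_index&lt;/code&gt; of PyTorch's cross-entropy loss, so padded positions do not contribute to the loss.&lt;/p&gt;

```python
LABEL_PAD_TOKEN_ID = -100  # same value that is passed to the collator above


def pad_labels(label_batch, pad_to_multiple_of=8, pad_id=LABEL_PAD_TOKEN_ID):
    """Right-pad label lists to a shared length (hypothetical helper, illustration only)."""
    longest = max(len(labels) for labels in label_batch)
    # Round the target length up to the next multiple (ceiling division).
    target = -(-longest // pad_to_multiple_of) * pad_to_multiple_of
    return [list(labels) + [pad_id] * (target - len(labels)) for labels in label_batch]


# Two made-up label sequences of different lengths.
batch = pad_labels([[71, 12, 9, 1], [5, 1]])
for row in batch:
    print(row)
```

&lt;p&gt;Padding to a multiple of 8 is mostly a hardware-friendliness choice. It costs a few extra ignored positions per batch.&lt;/p&gt;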

&lt;p&gt;Finally, we can define the training arguments. Feel free to play around with these. Depending on the use case, results can vary!&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Define training args&lt;/span&gt;

&lt;span class="n"&gt;training_args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Seq2SeqTrainingArguments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="c1"&gt;# training parameters&lt;/span&gt;
    &lt;span class="n"&gt;output_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;REPOSITORY_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;per_device_train_batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;per_device_eval_batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;predict_with_generate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;fp16&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Overflows with fp16&lt;/span&gt;
    &lt;span class="n"&gt;learning_rate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;5e-5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;num_train_epochs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;NUM_TRAIN_EPOCHS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;# logging &amp;amp; evaluation strategies&lt;/span&gt;
    &lt;span class="n"&gt;logging_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;REPOSITORY_ID&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/logs&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;logging_strategy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;steps&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;logging_steps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;evaluation_strategy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;epoch&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;save_strategy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;epoch&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;save_total_limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;load_best_model_at_end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;# push to hub parameters&lt;/span&gt;
    &lt;span class="n"&gt;report_to&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;tensorboard&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;push_to_hub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;hub_strategy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;every_save&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;hub_model_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;REPOSITORY_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;hub_token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;HfFolder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_token&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;disable_tqdm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Create Trainer instance&lt;/span&gt;

&lt;span class="n"&gt;trainer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Seq2SeqTrainer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;training_args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;data_collator&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data_collator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;train_dataset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tokenized_dataset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;train&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;eval_dataset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tokenized_dataset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;test&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;compute_metrics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;compute_metrics&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Now we can kick off training! This can take ~20 minutes with the small model, or a couple of hours with the base model. Of course, this largely depends on how much data you have, which GPU you use, the number of epochs, etc. Feel free to tweak the params above to your liking!&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# Start training&lt;/span&gt;

&lt;span class="n"&gt;trainer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Once training is done, we can now push things to the hub!&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;save_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;REPOSITORY_ID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;trainer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create_model_card&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;trainer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;push_to_hub&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;push_to_hub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;REPOSITORY_ID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="5-generating-conversations"&gt;5. Generating conversations&lt;/h2&gt;
&lt;p&gt;Now, generation with these models can be quite a challenge. OpenAI, for example, used &lt;a href="https://openai.com/research/instruction-following"&gt;RLHF&lt;/a&gt; to align its models and reward them for generating nice outputs. Here's a great sketch that illustrates the importance of the different training steps:&lt;/p&gt;
&lt;p&gt;&lt;img alt="image" src="https://wompampsupport.azureedge.net/fetchimage?siteId=7575&amp;amp;v=2&amp;amp;jpgQuality=100&amp;amp;width=700&amp;amp;url=https%3A%2F%2Fi.kym-cdn.com%2Fentries%2Ficons%2Ffacebook%2F000%2F044%2F025%2Fshoggothhh_header.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;In our use case, there are some parameters we can leverage when generating text. In &lt;a href="https://huggingface.co/blog/how-to-generate"&gt;this article&lt;/a&gt;, Patrick von Platen goes through the different techniques &amp;amp; methods we can use to control the output of these models.&lt;/p&gt;
&lt;p&gt;There are &lt;em&gt;quite a few&lt;/em&gt; parameters you can tweak:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Whether you use greedy or beam search&lt;/li&gt;
&lt;li&gt;Sampling&lt;/li&gt;
&lt;li&gt;Top-K sampling&lt;/li&gt;
&lt;li&gt;Top-P sampling&lt;/li&gt;
&lt;li&gt;Size of n-grams to not repeat&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I advise you to go through the article and learn a bit about each one of these!&lt;/p&gt;
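&lt;p&gt;As a rough intuition for how temperature, top-k and top-p interact, here is a simplified pure-Python sketch. The function name &lt;code&gt;filter_next_token_probs&lt;/code&gt; and the toy logits are assumptions made for illustration. The real library applies these steps to tensors and differs in details, so treat this as a sketch, not as the &lt;code&gt;transformers&lt;/code&gt; implementation.&lt;/p&gt;

```python
import math
import random


def filter_next_token_probs(logits, temperature=0.7, top_k=50, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    # Temperature below 1 sharpens the distribution, above 1 flattens it.
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)
    weights = [math.exp(value - peak) for value in scaled]  # numerically stable softmax
    total = sum(weights)
    probs = [weight / total for weight in weights]

    # Top-k: keep only the k most likely tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for index in ranked:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break

    # Renormalise so the surviving probabilities sum to 1.
    kept_total = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_total for i in kept}


toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]  # made-up scores for a 5-token vocabulary
candidates = filter_next_token_probs(toy_logits, temperature=0.7, top_k=3, top_p=0.9)
next_token = random.choices(list(candidates), weights=list(candidates.values()))[0]
print(candidates, next_token)
```

&lt;p&gt;Lowering &lt;code&gt;top_p&lt;/code&gt; or &lt;code&gt;top_k&lt;/code&gt; shrinks the candidate set and makes output more conservative. Raising the temperature does the opposite.&lt;/p&gt;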
&lt;p&gt;Context given, let's get to it:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;torch&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Let's first load the model we pushed to the hub. (You can also just use the directory where your model is saved instead.)&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# load tokenizer and model&lt;/span&gt;

&lt;span class="n"&gt;REPOSITORY_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;flan-t5-small-chat/checkpoint-8000/&amp;quot;&lt;/span&gt;  &lt;span class="c1"&gt;# hub repository name, or (as here) a local checkpoint directory&lt;/span&gt;
&lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;REPOSITORY_ID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoModelForSeq2SeqLM&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;REPOSITORY_ID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# put model on GPU&lt;/span&gt;

&lt;span class="n"&gt;device&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;cuda:0&amp;quot;&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cuda&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;is_available&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;cpu&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;To kick off the generation, we select a &lt;em&gt;random&lt;/em&gt; conversation from our dataset:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;502&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;sample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;test&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;STARTING_TEXT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;conversation&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;STARTING_TEXT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Tiago Pereira: 12h EM PORTUGAL:&lt;/span&gt;

&lt;span class="c1"&gt;# Tiago Pereira: 2 votos por pessoa pfv.&lt;/span&gt;

&lt;span class="c1"&gt;# Hugo Silva: Nao é anonymous?&lt;/span&gt;

&lt;span class="c1"&gt;# Tiago Pereira: Não. Tudo visivel&lt;/span&gt;

&lt;span class="c1"&gt;# Luís Rodrigues: Foi uma bela administração sim senhor.&lt;/span&gt;

&lt;span class="c1"&gt;# Leonardo Soares: Obrigado a esta administração! Dinâmica, Pacífica, Estruturadora, Libertadora!&lt;/span&gt;

&lt;span class="c1"&gt;# Leonardo Soares: #XeJonnyLegislativas2026&lt;/span&gt;

&lt;span class="c1"&gt;# Leonardo Soares: (2026 right? 😅)&lt;/span&gt;

&lt;span class="c1"&gt;# André Ferreira: PORTUGAL HOJE&lt;/span&gt;

&lt;span class="c1"&gt;# Raul Carvalho: Ahahaha vamos!!!!&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# remember to learn and tweak these params&lt;/span&gt;

&lt;span class="n"&gt;generation_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;max_length&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;no_repeat_ngram_size&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;do_sample&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;top_k&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;top_p&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;temperature&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;num_return_sequences&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;&amp;quot;repetition_penalty&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;1.3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;encoded_conversation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;STARTING_TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_tensors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;pt&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;input_ids&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;device&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;output_encoded&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;encoded_conversation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;generation_params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;output_decoded&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_encoded&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;skip_special_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Response:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;output_decoded&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Response:&lt;/span&gt;

&lt;span class="c1"&gt;# Raul Carvalho: Mas é uma bela&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Another fun thing to do is to let the model generate conversations entirely by itself.&lt;/p&gt;
&lt;p&gt;The idea here is to:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Start with a real conversation (5 replies)&lt;/li&gt;
&lt;li&gt;Generate a response using our model&lt;/li&gt;
&lt;li&gt;Create a new exchange (4 real replies + 1 AI generated)&lt;/li&gt;
&lt;li&gt;Generate a response&lt;/li&gt;
&lt;li&gt;Create a new exchange (3 real replies + 2 AI generated)&lt;/li&gt;
&lt;li&gt;etc.&lt;/li&gt;
&lt;/ol&gt;
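&lt;p&gt;The window bookkeeping in the steps above can be sketched without a model at all. In this minimal example, &lt;code&gt;fake_generate&lt;/code&gt; is a hypothetical stand-in for the fine-tuned model and the five starting messages are invented. It only shows how the oldest line is dropped as each generated reply is appended.&lt;/p&gt;

```python
def fake_generate(conversation):
    """Stand-in for the fine-tuned model (hypothetical): returns a numbered canned reply."""
    already_generated = conversation.count("Bot:")
    return f"Bot: generated reply {already_generated + 1}"


def roll_conversation(starting_text, generate_reply, rounds=5):
    """Append one generated reply per round and drop the oldest line each time."""
    conversation = starting_text
    history = []
    for _ in range(rounds):
        reply = generate_reply(conversation)
        lines = conversation.split("\n") + [reply]
        conversation = "\n".join(lines[1:])  # the window size stays constant
        history.append(conversation)
    return history


# Five invented "real" messages to start from.
start = "\n".join(f"Person {n}: real message {n}" for n in range(1, 6))
history = roll_conversation(start, fake_generate, rounds=5)
for step in history:
    print(step, "----", sep="\n")
```

&lt;p&gt;After as many rounds as there were starting lines, every line left in the window was produced by the generator.&lt;/p&gt;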
&lt;p&gt;Eventually, we'll end up with a bunch of AI-generated conversations from our model! Here's the code to do that:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="n"&gt;conversation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;
&lt;span class="n"&gt;NUMBER_OF_ROUNDS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUMBER_OF_ROUNDS&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;conversation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;conversation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;STARTING_TEXT&lt;/span&gt;

    &lt;span class="n"&gt;encoded_conversation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conversation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_tensors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;pt&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;input_ids&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;device&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;output_encoded&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;encoded_conversation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;generation_params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;output_decoded&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_encoded&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;skip_special_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;conversation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conversation&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;output_decoded&lt;/span&gt;
    &lt;span class="n"&gt;conversation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conversation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:])&lt;/span&gt;  &lt;span class="c1"&gt;# remove first intervention&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;New conversation:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;conversation&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;----&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# New conversation:&lt;/span&gt;

&lt;span class="c1"&gt;# Tiago Pereira: 2 votos por pessoa pfv.&lt;/span&gt;

&lt;span class="c1"&gt;# Hugo Silva: Nao é anonymous?&lt;/span&gt;

&lt;span class="c1"&gt;# Tiago Pereira: Não. Tudo visivel&lt;/span&gt;

&lt;span class="c1"&gt;# Luís Rodrigues: Foi uma bela administração sim senhor.&lt;/span&gt;

&lt;span class="c1"&gt;# Leonardo Soares: Obrigado a esta administração! Dinâmica, Pacífica, Estruturadora, Libertadora!&lt;/span&gt;

&lt;span class="c1"&gt;# Leonardo Soares: #XeJonnyLegislativas2026&lt;/span&gt;

&lt;span class="c1"&gt;# Leonardo Soares: (2026 right? 😅)&lt;/span&gt;

&lt;span class="c1"&gt;# André Ferreira: PORTUGAL HOJE&lt;/span&gt;

&lt;span class="c1"&gt;# Raul Carvalho: Ahahaha vamos!!!!&lt;/span&gt;

&lt;span class="c1"&gt;# Raul Carvalho: Ahaha&lt;/span&gt;

&lt;span class="c1"&gt;# ----&lt;/span&gt;

&lt;span class="c1"&gt;# New conversation:&lt;/span&gt;

&lt;span class="c1"&gt;# Hugo Silva: Nao é anonymous?&lt;/span&gt;

&lt;span class="c1"&gt;# Tiago Pereira: Não. Tudo visivel&lt;/span&gt;

&lt;span class="c1"&gt;# Luís Rodrigues: Foi uma bela administração sim senhor.&lt;/span&gt;

&lt;span class="c1"&gt;# Leonardo Soares: Obrigado a esta administração! Dinâmica, Pacífica, Estruturadora, Libertadora!&lt;/span&gt;

&lt;span class="c1"&gt;# Leonardo Soares: #XeJonnyLegislativas2026&lt;/span&gt;

&lt;span class="c1"&gt;# Leonardo Soares: (2026 right? 😅)&lt;/span&gt;

&lt;span class="c1"&gt;# André Ferreira: PORTUGAL HOJE&lt;/span&gt;

&lt;span class="c1"&gt;# Raul Carvalho: Ahahaha vamos!!!!&lt;/span&gt;

&lt;span class="c1"&gt;# Raul Carvalho: Ahaha&lt;/span&gt;

&lt;span class="c1"&gt;# André Ferreira: Pa no estou?&lt;/span&gt;

&lt;span class="c1"&gt;# ----&lt;/span&gt;

&lt;span class="c1"&gt;# New conversation:&lt;/span&gt;

&lt;span class="c1"&gt;# Tiago Pereira: Não. Tudo visivel&lt;/span&gt;

&lt;span class="c1"&gt;# Luís Rodrigues: Foi uma bela administração sim senhor.&lt;/span&gt;

&lt;span class="c1"&gt;# Leonardo Soares: Obrigado a esta administração! Dinâmica, Pacífica, Estruturadora, Libertadora!&lt;/span&gt;

&lt;span class="c1"&gt;# Leonardo Soares: #XeJonnyLegislativas2026&lt;/span&gt;

&lt;span class="c1"&gt;# Leonardo Soares: (2026 right? 😅)&lt;/span&gt;

&lt;span class="c1"&gt;# André Ferreira: PORTUGAL HOJE&lt;/span&gt;

&lt;span class="c1"&gt;# Raul Carvalho: Ahahaha vamos!!!!&lt;/span&gt;

&lt;span class="c1"&gt;# Raul Carvalho: Ahaha&lt;/span&gt;

&lt;span class="c1"&gt;# André Ferreira: Pa no estou?&lt;/span&gt;

&lt;span class="c1"&gt;# André Ferreira: No estou a caralhar&lt;/span&gt;

&lt;span class="c1"&gt;# ----&lt;/span&gt;

&lt;span class="c1"&gt;# New conversation:&lt;/span&gt;

&lt;span class="c1"&gt;# Luís Rodrigues: Foi uma bela administração sim senhor.&lt;/span&gt;

&lt;span class="c1"&gt;# Leonardo Soares: Obrigado a esta administração! Dinâmica, Pacífica, Estruturadora, Libertadora!&lt;/span&gt;

&lt;span class="c1"&gt;# Leonardo Soares: #XeJonnyLegislativas2026&lt;/span&gt;

&lt;span class="c1"&gt;# Leonardo Soares: (2026 right? 😅)&lt;/span&gt;

&lt;span class="c1"&gt;# André Ferreira: PORTUGAL HOJE&lt;/span&gt;

&lt;span class="c1"&gt;# Raul Carvalho: Ahahaha vamos!!!!&lt;/span&gt;

&lt;span class="c1"&gt;# Raul Carvalho: Ahaha&lt;/span&gt;

&lt;span class="c1"&gt;# André Ferreira: Pa no estou?&lt;/span&gt;

&lt;span class="c1"&gt;# André Ferreira: No estou a caralhar&lt;/span&gt;

&lt;span class="c1"&gt;# André Ferreira: Ainda no estou?&lt;/span&gt;

&lt;span class="c1"&gt;# ----&lt;/span&gt;

&lt;span class="c1"&gt;# New conversation:&lt;/span&gt;

&lt;span class="c1"&gt;# Leonardo Soares: Obrigado a esta administração! Dinâmica, Pacífica, Estruturadora, Libertadora!&lt;/span&gt;

&lt;span class="c1"&gt;# Leonardo Soares: #XeJonnyLegislativas2026&lt;/span&gt;

&lt;span class="c1"&gt;# Leonardo Soares: (2026 right? 😅)&lt;/span&gt;

&lt;span class="c1"&gt;# André Ferreira: PORTUGAL HOJE&lt;/span&gt;

&lt;span class="c1"&gt;# Raul Carvalho: Ahahaha vamos!!!!&lt;/span&gt;

&lt;span class="c1"&gt;# Raul Carvalho: Ahaha&lt;/span&gt;

&lt;span class="c1"&gt;# André Ferreira: Pa no estou?&lt;/span&gt;

&lt;span class="c1"&gt;# André Ferreira: No estou a caralhar&lt;/span&gt;

&lt;span class="c1"&gt;# André Ferreira: Ainda no estou?&lt;/span&gt;

&lt;span class="c1"&gt;# Raul Carvalho: Haha nvel queres meeses&lt;/span&gt;

&lt;span class="c1"&gt;# ----&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;</content><category term="blog"/></entry><entry><title>Governo Sombra transcripts</title><link href="https://duarteocarmo.com/blog/governo-sombra-transcripts.html" rel="alternate"/><published>2023-04-19T16:30:00+02:00</published><updated>2023-04-19T16:30:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2023-04-19:/blog/governo-sombra-transcripts.html</id><summary type="html">&lt;p&gt;&lt;center&gt;
&lt;a href="https://governosombra.duarteocarmo.com" target="_blank"&gt;
&lt;img src="https://duarteocarmo.com/images/51/website.png" alt="governosombra.duarteocarmo.com" style="max-width:100%;border-radius: 2px"&gt;
&lt;/a&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;7 years. That's how long I've lived in Denmark for. I love it, but Portugal is still close to my heart. As an emigrant, it's always hard to stay connected to what's going on in Portugal. What are people talking about? What's in the news? What worries people? What is …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;center&gt;
&lt;a href="https://governosombra.duarteocarmo.com" target="_blank"&gt;
&lt;img src="https://duarteocarmo.com/images/51/website.png" alt="governosombra.duarteocarmo.com" style="max-width:100%;border-radius: 2px"&gt;
&lt;/a&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;7 years. That's how long I've lived in Denmark for. I love it, but Portugal is still close to my heart. As an emigrant, it's always hard to stay connected to what's going on in Portugal. What are people talking about? What's in the news? What worries people? What is everyone arguing about over morning coffee?&lt;/p&gt;
&lt;p&gt;One of the ways I like to stay in touch is by listening to &lt;em&gt;Governo Sombra&lt;/em&gt; (now cleverly called  &lt;em&gt;Program whose name we are legally prevented from saying&lt;/em&gt;, after changing networks). It's a weekly show where the 3 guests (+1 host) comment on Portuguese and World news. Besides being funny, I also love the fact that the 3 guests represent different parts of the political spectrum, so I can get a good idea about how most of the people are feeling.&lt;/p&gt;
&lt;p&gt;Inspired by &lt;a href="https://karpathy.ai/lexicap/"&gt;Lexicap&lt;/a&gt;, I decided to build a &lt;a href="https://governosombra.duarteocarmo.com"&gt;website&lt;/a&gt; with the transcripts for all of the episodes of the show. More than once I've listened to a particular part of an episode and wanted to share it with a friend. Now, &lt;a href="https://governosombra.duarteocarmo.com/episodes/171#170"&gt;I can do it&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For the transcription, I used OpenAI's Open Source &lt;a href="https://github.com/openai/whisper"&gt;Whisper&lt;/a&gt; model. With a  &lt;em&gt;small&lt;/em&gt; caveat: the whole thing (serving + transcribing) needed to run in my 20 EUR/month &lt;a href="/blog/down-from-the-cloud-self-hosting.html"&gt;VM&lt;/a&gt;. So it needed to be small &lt;em&gt;and&lt;/em&gt; efficient.&lt;/p&gt;
&lt;p&gt;I like Python, but Rust was the obvious choice. For the transcription part, I used &lt;a href="https://github.com/tazz4843/whisper-rs"&gt;whisper.rs&lt;/a&gt; (Rust bindings for &lt;a href="https://github.com/ggerganov/whisper.cpp/"&gt;whisper.cpp&lt;/a&gt;). For serving the app, I went with &lt;a href="https://actix.rs/"&gt;Actix Web&lt;/a&gt; - it's small, efficient, and reminds me a lot of Flask. Incredible how a small Linux box can handle transcribing 60min+ episodes without hiccuping much.&lt;/p&gt;
&lt;p&gt;The quality of the transcription is something like a 6/10. I did use the &lt;a href="https://github.com/openai/whisper#available-models-and-languages"&gt;base&lt;/a&gt; model so there is &lt;em&gt;clearly&lt;/em&gt; space for improvement. Maybe when I get a dedicated box.&lt;/p&gt;
&lt;p&gt;The entire thing is up on &lt;a href="https://github.com/duarteocarmo/governosombra"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>LLMs in production: lessons learned</title><link href="https://duarteocarmo.com/blog/llms-lessons-learned.html" rel="alternate"/><published>2023-03-26T20:35:00+02:00</published><updated>2023-03-26T20:35:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2023-03-26:/blog/llms-lessons-learned.html</id><summary type="html">&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/50/openai_3.png" alt="OpenAI image 1" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;A couple of months ago nobody asked me about my work. &lt;em&gt;Something&lt;/em&gt; related to computers and AI. Fast forward to today, even my uncle asks me about ChatGPT. The &lt;a href="https://www.urbandictionary.com/define.php?term=hype"&gt;&lt;em&gt;hype&lt;/em&gt;&lt;/a&gt; is real. Only time will tell if the hype will materialize. But while the world wonders, work goes on.&lt;/p&gt;
&lt;p&gt;In …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/50/openai_3.png" alt="OpenAI image 1" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;A couple of months ago nobody asked me about my work. &lt;em&gt;Something&lt;/em&gt; related to computers and AI. Fast forward to today, even my uncle asks me about ChatGPT. The &lt;a href="https://www.urbandictionary.com/define.php?term=hype"&gt;&lt;em&gt;hype&lt;/em&gt;&lt;/a&gt; is real. Only time will tell if the hype will materialize. But while the world wonders, work goes on.&lt;/p&gt;
&lt;p&gt;In the last couple of months, I've helped develop a product that leverages this tech at its core. It was - to say the least - a learning experience. Full of lessons learned, full of little traumas and things I would've done better. In the hopes of helping someone out there, here are some lessons I've learned along the way.&lt;/p&gt;
&lt;h3 id="know-your-use-case"&gt;Know your use case&lt;/h3&gt;
&lt;p&gt;With so much hype surrounding LLMs, it's easy to think they'll solve all problems in Machine Learning. Or at least the ones related to NLP. From what I've seen, this is hardly the case. Let's split Machine Learning tasks into two types. (1) Predictive tasks, such as classifying the sentiment of a tweet, and (2) generative tasks, such as summarizing the content of an article.&lt;/p&gt;
&lt;p&gt;GPT is &lt;em&gt;great&lt;/em&gt; at generative tasks. Writing an email with context, writing a summary of a web page, and creating an article given some ideas. These are a very specific subset of Machine Learning. Most of the problems we face are predictive problems: what is the sentiment of this tweet? What is the class of this image? It's hard to tell exactly how good these models will become for the predictive use case. But before throwing GPT at whatever you're facing, think about the use case. This leads me to my next point.&lt;/p&gt;
&lt;h3 id="deterministic-vs-stochastic"&gt;Deterministic vs. stochastic&lt;/h3&gt;
&lt;p&gt;Do you know the classic "&lt;em&gt;I cannot reproduce this issue&lt;/em&gt;" we've said oh so many times? Well, welcome to a whole other level of that. "&lt;em&gt;The AI said something wrong&lt;/em&gt;" is a very common issue I've faced with these models. At the core, LLMs are stochastic, and not deterministic. Before LLMs, whatever model we were building, given an input, would always produce the same output. With LLMs, given the same input, the output is &lt;em&gt;rarely&lt;/em&gt; the same.&lt;/p&gt;
&lt;p&gt;This is amazing for generative tasks but can become a real problem for predictive tasks. The problem of reproducibility. To avoid this issue, you can play around with some of the parameters of this model such as the &lt;a href="https://lukesalamone.github.io/posts/what-is-temperature/"&gt;temperature&lt;/a&gt; or the presence penalty. This further reinforces the idea that these models are great for generating text - where the cost of failure is low. If you're predicting something with a high cost of failure - best beware.&lt;/p&gt;
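&lt;p&gt;To make the temperature knob concrete, here is a toy sampler. It is a sketch of the general idea, not how any particular API implements it, and the logits and token names are made up. With a temperature of 0 it always picks the top token; with a higher temperature the same input gives different outputs.&lt;/p&gt;

```python
import math
import random


def sample_token(logits, temperature, rng):
    """Pick one token from a dict of token -> logit, scaled by temperature."""
    if temperature == 0:
        # Greedy decoding: always return the highest-scoring token.
        return max(logits, key=logits.get)
    # Softmax over temperature-scaled logits (shifted by the max for stability).
    scaled = {tok: value / temperature for tok, value in logits.items()}
    top = max(scaled.values())
    weights = {tok: math.exp(value - top) for tok, value in scaled.items()}
    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]


# Made-up logits for the next token after "The sentiment is".
logits = {"positive": 2.0, "negative": 1.5, "neutral": 0.5}
rng = random.Random(0)

# 100 draws each: greedy collapses to one answer, sampling does not.
greedy = {sample_token(logits, 0, rng) for _ in range(100)}
sampled = {sample_token(logits, 1.5, rng) for _ in range(100)}
```

&lt;p&gt;For a predictive task you would want the &lt;code&gt;greedy&lt;/code&gt; behaviour; for a generative one, the variety in &lt;code&gt;sampled&lt;/code&gt; is often the point.&lt;/p&gt;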
&lt;h3 id="prepare-for-the-future"&gt;Prepare for the future&lt;/h3&gt;
&lt;p&gt;This tech is moving incredibly fast. Yes, even for &lt;em&gt;us&lt;/em&gt;, the group of people that &lt;em&gt;loves&lt;/em&gt; to move fast. OpenAI is releasing models and updates at an astronomical pace. By the time you finish developing that shiny new product, there will probably be a new model out. This happened to us actually. We started developing a system based on GPT-3 (e.g., &lt;code&gt;text-davinci-003&lt;/code&gt;). Shortly after, OpenAI released the ChatGPT model (e.g., &lt;code&gt;gpt-3.5-turbo&lt;/code&gt;). Two days before we went to production - GPT-4 (e.g., &lt;code&gt;gpt-4&lt;/code&gt;) was here.&lt;/p&gt;
&lt;p&gt;These models use different APIs, structures, and behaviors. Thankfully, I'd spent a Friday afternoon implementing the &lt;a href="https://refactoring.guru/design-patterns/strategy/python/example"&gt;Strategy Pattern&lt;/a&gt;, so that we could support both &lt;a href="https://platform.openai.com/docs/guides/completion"&gt;Text Completion&lt;/a&gt; and &lt;a href="https://platform.openai.com/docs/guides/chat"&gt;Chat Completion&lt;/a&gt; for our system. Turned out to be time well spent, and adopting GPT-4 was a moderate one-line code change. Whenever a new model comes out, expect to be rate limited, and expect timeouts. Especially with these technologies, it's worth preparing for the future, and anticipating what &lt;em&gt;might&lt;/em&gt; be released. Without &lt;a href="/blog/simple-software.html"&gt;exaggerating&lt;/a&gt;, of course.&lt;/p&gt;
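&lt;p&gt;A minimal sketch of what that Strategy Pattern could look like. The class names (&lt;code&gt;TextCompletion&lt;/code&gt;, &lt;code&gt;ChatCompletion&lt;/code&gt;, &lt;code&gt;Generator&lt;/code&gt;) are hypothetical and the payloads are simplified; this is not our production code, and it makes no network calls.&lt;/p&gt;

```python
from typing import Protocol


class CompletionStrategy(Protocol):
    """Anything that can turn a prompt into a request payload."""

    def build_request(self, prompt: str) -> dict: ...


class TextCompletion:
    """Text-completion style payload: a single prompt string."""

    def __init__(self, model: str):
        self.model = model

    def build_request(self, prompt: str) -> dict:
        return {"model": self.model, "prompt": prompt}


class ChatCompletion:
    """Chat-completion style payload: a list of role/content messages."""

    def __init__(self, model: str):
        self.model = model

    def build_request(self, prompt: str) -> dict:
        messages = [{"role": "user", "content": prompt}]
        return {"model": self.model, "messages": messages}


class Generator:
    """The rest of the system only ever talks to this class."""

    def __init__(self, strategy: CompletionStrategy):
        self.strategy = strategy

    def request_for(self, prompt: str) -> dict:
        return self.strategy.build_request(prompt)


generator = Generator(ChatCompletion("gpt-3.5-turbo"))
# Adopting a new chat model is a one-line change at the call site.
generator = Generator(ChatCompletion("gpt-4"))
```

&lt;p&gt;The point of the design is that callers depend on &lt;code&gt;Generator&lt;/code&gt; alone, so a new API shape means one new strategy class and nothing else.&lt;/p&gt;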
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/50/openai_2.png" alt="OpenAI image 2" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;h3 id="streaming-vs-batching"&gt;Streaming vs. Batching&lt;/h3&gt;
&lt;p&gt;Yes, it comes down to that, a single API call to OpenAI. (Well, at least while we don't have a solid open-source alternative). Given a large enough prompt, these models can take quite a while to return a result. There's nothing worse than making a user wait for 20-30 seconds before showing some action on the screen - especially when that request ends up failing sometimes. After some days of frustration, a co-worker sent me &lt;a href="https://github.com/openai/openai-cookbook/blob/main/examples/How_to_stream_completions.ipynb"&gt;this&lt;/a&gt; link.&lt;/p&gt;
&lt;p&gt;Instead of waiting for the whole completion to be finished, consider streaming the response. In short, it allows you to start receiving completion tokens as soon as they're generated. Together with FastAPI's excellent implementation of &lt;a href="https://fastapi.tiangolo.com/advanced/custom-response/#streamingresponse"&gt;&lt;code&gt;StreamingResponse&lt;/code&gt;&lt;/a&gt;, it allows you to show the completion happening in real-time to users. This pattern is widely used by implementations such as &lt;a href="https://www.notion.so/product/ai"&gt;Notion's&lt;/a&gt;. Now that all we have is an API, customer experience is even more differentiating.&lt;/p&gt;
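&lt;p&gt;The difference between the two approaches can be sketched with a plain Python generator. &lt;code&gt;fake_completion&lt;/code&gt; is a hypothetical stand-in for a streaming API, not a real client call. A generator like this is roughly the kind of object you would hand to a &lt;code&gt;StreamingResponse&lt;/code&gt;.&lt;/p&gt;

```python
import time
from typing import Callable, Iterator


def fake_completion(text: str, delay: float = 0.0) -> Iterator[str]:
    """Stand-in for a streaming API: yields one token at a time."""
    for token in text.split(" "):
        time.sleep(delay)  # pretend the model is still generating
        yield token + " "


def batch(text: str) -> str:
    """Batching: the caller sees nothing until every token has arrived."""
    return "".join(fake_completion(text))


def stream(text: str, on_token: Callable[[str], None]) -> None:
    """Streaming: the caller gets each token as soon as it exists."""
    for token in fake_completion(text):
        on_token(token)


shown = []
stream("streaming feels faster to users", shown.append)
```

&lt;p&gt;Both paths produce the same text. The only difference is when the user first sees something on the screen.&lt;/p&gt;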
&lt;h3 id="prompt-engineering-is-hard"&gt;Prompt engineering is hard&lt;/h3&gt;
&lt;p&gt;We have prompt engineers now. No wonder. I prefer the term prompt &lt;em&gt;artists&lt;/em&gt;. As soon as we had to put a system in place we struggled. We are used to fancy tools to track every parameter of an experiment. Now that we're designing prompts for a stochastic system, the game has changed. How can I know the impact of every little change in the prompt? How can I know how one instruction affects another instruction? What really defines a &lt;em&gt;good&lt;/em&gt; prompt? You see what I'm getting at. For now, it's an art - more than engineering.&lt;/p&gt;
&lt;p&gt;Very soon, our prompt was a list of 10-15 instructions giving very specific directions on what we needed to generate. To battle this, we created a set of scripts that would allow us to quickly compare the impact of small changes in the prompt. This is a costly exercise - given a stochastic system - since we have to call a paid API 2-3 times per variant to compare the effect of prompt changes. This allowed us to at least get a decent idea of how good the prompt is. Even so, prompt engineering still feels like a lot of shooting in the dark.&lt;/p&gt;
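&lt;p&gt;Here is a rough sketch of what such a comparison script could look like. Everything in it is an assumption for illustration: &lt;code&gt;fake_llm&lt;/code&gt; stands in for the paid API, and &lt;code&gt;score&lt;/code&gt; is a made-up metric. The idea is simply to run each prompt variant a few times and compare the averages.&lt;/p&gt;

```python
import random
from statistics import mean


def fake_llm(prompt: str, rng: random.Random) -> str:
    """Hypothetical stochastic model: output length varies around prompt length."""
    n_words = max(1, len(prompt.split()) + rng.randint(-2, 2))
    return " ".join(["word"] * n_words)


def score(output: str) -> float:
    """Hypothetical metric: how close is the output to a 10-word target?"""
    return -abs(len(output.split()) - 10)


def compare(prompts: dict, runs: int = 3, seed: int = 0) -> dict:
    """Average the score of each prompt variant over several runs."""
    rng = random.Random(seed)
    return {
        name: mean(score(fake_llm(prompt, rng)) for _ in range(runs))
        for name, prompt in prompts.items()
    }


results = compare({
    "short": "Summarize this article.",
    "detailed": "Summarize this article in one sentence of about ten words.",
})
```

&lt;p&gt;With a real API, every extra run costs money, which is why we kept the number of runs per variant so low.&lt;/p&gt;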
&lt;h3 id="conclusion-not-the-hammer-you-were-looking-for"&gt;Conclusion: Not the hammer you were looking for&lt;/h3&gt;
&lt;p&gt;LLMs are incredibly powerful tools. From generating text to now even supporting images - it's hard to tell where the field will lead us. For creative and generative tasks they shine like nothing before in our field. Being able to have access to such a powerful system will be groundbreaking in a lot of fields of machine learning. But with great power comes great responsibility - or &lt;em&gt;baggage&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Putting an LLM into production is a challenging problem with a lot of unknowns. From depending on a simple API call, to not being able to reproduce results - it can get complicated.&lt;/p&gt;
&lt;p&gt;If the task is simple enough (e.g., classification), it's still hard to justify a system that is more expensive, less explainable, and slower than &lt;em&gt;traditional&lt;/em&gt; methods. I don't think this is the hammer for all our problems. At least not &lt;em&gt;yet&lt;/em&gt;.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>parlabot - ask the Portuguese parliament</title><link href="https://duarteocarmo.com/blog/parlabot.html" rel="alternate"/><published>2023-02-27T18:55:00+01:00</published><updated>2023-02-27T18:55:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2023-02-27:/blog/parlabot.html</id><summary type="html">&lt;p&gt;Large language models (LLMs) are really &lt;a href="https://www.theverge.com/2023/2/8/23590864/google-ai-chatbot-bard-mistake-error-exoplanet-demo"&gt;dumb&lt;/a&gt;. I mean, how can you fail when the question is as simple as "&lt;em&gt;What is 23 times 18&lt;/em&gt;"? Even though they're making most headlines, at the end of the day, these models are predicting the next token based on the previous ones. If …&lt;/p&gt;</summary><content type="html">&lt;p&gt;Large language models (LLMs) are really &lt;a href="https://www.theverge.com/2023/2/8/23590864/google-ai-chatbot-bard-mistake-error-exoplanet-demo"&gt;dumb&lt;/a&gt;. I mean, how can you fail when the question is as simple as "&lt;em&gt;What is 23 times 18&lt;/em&gt;"? Even though they're making most headlines, at the end of the day, these models are predicting the next token based on the previous ones. If we ask a super simple question without any context to an LLM, the performance will only depend on &lt;a href="https://arxiv.org/pdf/2202.07206.pdf"&gt;how much the model has seen that example&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;But LLMs are also amazing. Have you used &lt;a href="https://github.com/features/copilot"&gt;GitHub copilot&lt;/a&gt; lately? I don't trust it to write my functions for me, but damn it writes a good boilerplate. It's like auto-complete, but &lt;em&gt;better&lt;/em&gt;. It doesn't only know your imports. It knows where you are in the code base, and can suggest based on that.&lt;/p&gt;
&lt;p&gt;The difference? &lt;em&gt;Context&lt;/em&gt;. When using copilot, your code &lt;em&gt;is&lt;/em&gt; the context, and that's why copilot knows exactly what to suggest. So how can I give the right context to one of these models? To answer this, I built &lt;a href="https://parlabot.duarteocarmo.com/"&gt;parlabot&lt;/a&gt;. Parlabot (&lt;a href="https://www.collinsdictionary.com/dictionary/portuguese-english/parlamento"&gt;parlamento&lt;/a&gt; + bot) uses all transcripts from the Portuguese parliament to answer any question you might have about Portuguese politics. It also (tries) to do so in the most truthful way it can, with the information it has.&lt;/p&gt;
&lt;p&gt;&lt;a target="_blank" href="https://parlabot.duarteocarmo.com"&gt;
&lt;img src="https://duarteocarmo.com/images/49/parlabot-screenshot.png" alt="Parlabot website" style="max-width:100%;"&gt;
&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;There are two things at play when you ask a question to parlabot. Below is a sketch that might make things easier to follow.&lt;/p&gt;
&lt;p&gt;The first part is the search engine. To make this work, I scraped all transcripts from the Portuguese parliament and used a multilingual embedding model to transform them into vectors. Whenever you type in a question, I embed it using the exact same model, transforming it into another vector. I then find the top k reference vectors most similar to the query vector. This is the same as finding the top k most relevant speech segments for that question.&lt;/p&gt;
&lt;p&gt;Once I have the most relevant speech segments, it's time to build the &lt;em&gt;prompt&lt;/em&gt;. The prompt has two parts: the &lt;em&gt;context&lt;/em&gt; and the &lt;em&gt;direction&lt;/em&gt;. The context is simply the most relevant speech segments, the corresponding speakers, and the political parties. In the direction part of the prompt, I ask GPT-3 to answer a given question using the segments from the context.&lt;/p&gt;
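&lt;p&gt;The retrieval step can be sketched in a few lines. This is a toy version with made-up 3-dimensional vectors and hypothetical segment names, not parlabot's actual code; a real embedding model outputs vectors with hundreds of dimensions.&lt;/p&gt;

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_k(query, segments, k):
    """Return the ids of the k segments most similar to the query vector."""
    ranked = sorted(
        segments,
        key=lambda seg_id: cosine(query, segments[seg_id]),
        reverse=True,
    )
    return ranked[:k]


# Made-up embeddings for three speech segments.
segments = {
    "budget-debate": [0.9, 0.1, 0.0],
    "health-reform": [0.1, 0.9, 0.1],
    "tax-proposal": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]  # pretend embedding of a question about the budget
best = top_k(query, segments, k=2)
```

&lt;p&gt;Sorting every segment works for a toy example. With a full set of parliament transcripts you would probably reach for a vector index instead.&lt;/p&gt;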
&lt;p&gt;&lt;a target="_blank" href="https://parlabot.duarteocarmo.com"&gt;
&lt;img src="https://duarteocarmo.com/images/49/LLM-sketch.png" alt="LLM Question/Answer system" style="max-width:100%;border-radius: 2px"&gt;
&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The results are pretty good! The bot is very capable of using the context given to answer most questions. Of course, the context is not always relevant to a question. For example, if you ask "How to bake a cake?", it's unlikely we'll find the answer in the data set of parliament transcriptions. With good prompt design, we can make the bot answer "&lt;em&gt;I don't know&lt;/em&gt;" when this is the case.&lt;/p&gt;
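&lt;p&gt;As a rough illustration of the context + direction idea, including the "I don't know" escape hatch, a prompt builder might look like this. The wording of the instruction, the function name, and the example segment are assumptions, not the actual parlabot prompt.&lt;/p&gt;

```python
def build_prompt(question, segments):
    """Assemble context + direction; each segment is (speaker, party, text)."""
    # Context: the retrieved segments, with speaker and party attached.
    context = "\n".join(
        f"- {speaker} ({party}): {text}" for speaker, party, text in segments
    )
    # Direction: what to do with the context, and what to do without it.
    direction = (
        "Answer the question using only the context above. "
        'If the context does not contain the answer, reply "I don\'t know".'
    )
    return f"Context:\n{context}\n\nQuestion: {question}\n{direction}"


prompt = build_prompt(
    "What was said about the budget?",
    [("Speaker A", "Party X", "The budget must be balanced by next year.")],
)
```

&lt;p&gt;The resulting string is what gets sent to the model. Whether the model actually respects the fallback instruction is something you can only check by trying it.&lt;/p&gt;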
&lt;p&gt;The mix of LLMs, politics, and poorly written code is a great recipe for disaster. Although disaster does sound pretty fun,  I don't expect this bot to answer &lt;em&gt;truthfully&lt;/em&gt; any of the questions Portuguese tax-payers have about elected parties. These answers should be taken with a massive grain of salt and skepticism. LLMs are stochastic beasts that might predict two different outputs with the exact same set of inputs.&lt;/p&gt;
&lt;p&gt;Still, this system might be interesting for problems where the cost of failure is not catastrophic, as long as we remember that a bit of skepticism is a great foundation for any ML system.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>An opinionated Python boilerplate</title><link href="https://duarteocarmo.com/blog/opinionated-python-boilerplate.html" rel="alternate"/><published>2023-02-18T00:00:00+01:00</published><updated>2023-02-18T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2023-02-18:/blog/opinionated-python-boilerplate.html</id><summary type="html">&lt;p&gt;There's nothing quite like starting a new project. A greenfield, &lt;em&gt;filled&lt;/em&gt; with possibilities. It's a privilege many don't come across. A lot of us get thrown into projects with a lot of legacy code. But sometimes, we start from scratch.&lt;/p&gt;
&lt;p&gt;This is the time. The time to make all the …&lt;/p&gt;</summary><content type="html">&lt;p&gt;There's nothing quite like starting a new project. A greenfield, &lt;em&gt;filled&lt;/em&gt; with possibilities. It's a privilege many don't come across. A lot of us get thrown into projects with a lot of legacy code. But sometimes, we start from scratch.&lt;/p&gt;
&lt;p&gt;This is the time. The time to make all the right calls. The time to use the right tools, the right abstractions, and the right structure. The unfortunate truth is, &lt;a href="/blog/simple-software.html"&gt;there is no &lt;em&gt;right&lt;/em&gt; way&lt;/a&gt;. There are just &lt;em&gt;ways.&lt;/em&gt; Most of us use the tools we're comfortable with. We've tried things in the past. Some worked, some didn't.&lt;/p&gt;
&lt;p&gt;I've started my fair share of Python projects. By failing, &lt;em&gt;a lot&lt;/em&gt;, I've converged to a set of tools. These will change over time. But in the hope to help a fellow Pythonista out there, let's talk about them.&lt;/p&gt;
&lt;h2 id="pip-tools-for-dependency-management"&gt;&lt;a href="https://pip-tools.readthedocs.io/en/latest/"&gt;Pip-tools&lt;/a&gt; for dependency management&lt;/h2&gt;
&lt;p&gt;Now I don't want to start a packaging war. If &lt;a href="https://python-poetry.org/"&gt;Poetry&lt;/a&gt; works for you, then by all means, go for it. I've tried most of the stuff out there, from the good old &lt;code&gt;pip freeze &amp;gt; requirements.txt&lt;/code&gt;, &lt;a href="https://pipenv.pypa.io/en/latest/"&gt;pipenv&lt;/a&gt;, all the way to Poetry. After many battles, I've stuck with &lt;a href="https://github.com/jazzband/pip-tools"&gt;pip-tools&lt;/a&gt;. Pip-tools strikes the right balance between simplicity, effectiveness, and speed. And yes, &lt;em&gt;speed&lt;/em&gt; matters. I don't want to wait a whole minute for my dependencies to compile.&lt;/p&gt;
&lt;p&gt;With recent pip updates, I can just specify my dependencies in a &lt;code&gt;pyproject.toml&lt;/code&gt; and install them with &lt;code&gt;pip install -e .&lt;/code&gt;. However, there are benefits to pinning your dependencies using something like &lt;a href="https://github.com/jazzband/pip-tools#pip-tools--pip-compile--pip-sync"&gt;pip-tools&lt;/a&gt;, especially in Machine Learning environments. Yes, Anaconda exists - but speed &lt;em&gt;is&lt;/em&gt; a requirement.&lt;/p&gt;
&lt;p&gt;The folks at &lt;a href="https://jazzband.co/"&gt;Jazzband&lt;/a&gt; have created an easy-to-use tool that has yet to let me down. For example, suppose I have a &lt;code&gt;dreambox&lt;/code&gt; project. Here's an example &lt;code&gt;pyproject.toml&lt;/code&gt; file:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;[build-system]&lt;/span&gt;
&lt;span class="n"&gt;requires&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;hatchling&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;build-backend&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;hatchling.build&amp;quot;&lt;/span&gt;

&lt;span class="k"&gt;[project]&lt;/span&gt;
&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;dreambox&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;42&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;dependencies&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;pandas&amp;gt;=1.5.3&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;numpy&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;fastapi&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;[project.optional-dependencies]&lt;/span&gt;
&lt;span class="n"&gt;dev&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;pytest&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;To create a pinned and hashed &lt;code&gt;requirements&lt;/code&gt; file, all you need to do is:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;env&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;$&lt;span class="w"&gt; &lt;/span&gt;pip-compile&lt;span class="w"&gt; &lt;/span&gt;--generate-hashes&lt;span class="w"&gt; &lt;/span&gt;--output-file&lt;span class="o"&gt;=&lt;/span&gt;requirements.txt&lt;span class="w"&gt; &lt;/span&gt;pyproject.toml
&lt;span class="o"&gt;(&lt;/span&gt;env&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;$&lt;span class="w"&gt; &lt;/span&gt;pip-compile&lt;span class="w"&gt; &lt;/span&gt;--generate-hashes&lt;span class="w"&gt; &lt;/span&gt;--extra&lt;span class="o"&gt;=&lt;/span&gt;dev&lt;span class="w"&gt; &lt;/span&gt;--output-file&lt;span class="o"&gt;=&lt;/span&gt;requirements-dev.txt&lt;span class="w"&gt; &lt;/span&gt;pyproject.toml
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;That ensures whatever you build in production is reproducible on your own machine, down to the exact dependency and hash.&lt;/p&gt;
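&lt;p&gt;Conceptually, a hashed requirements file pins not just a version but the digest of the exact file to install. The sketch below illustrates that idea with &lt;code&gt;hashlib&lt;/code&gt;. It is not pip's implementation, and the artifact bytes and the &lt;code&gt;matches_pin&lt;/code&gt; helper are made up for illustration.&lt;/p&gt;

```python
import hashlib


def matches_pin(artifact: bytes, pinned_sha256: str) -> bool:
    """True if the downloaded bytes hash to the digest recorded at pin time."""
    return hashlib.sha256(artifact).hexdigest() == pinned_sha256


# At compile time: record the digest of the artifact that was resolved.
artifact = b"pretend this is a wheel file"
pin = hashlib.sha256(artifact).hexdigest()

# At install time: the same bytes pass, anything else is rejected.
ok = matches_pin(artifact, pin)
tampered = matches_pin(artifact + b"!", pin)
```

&lt;p&gt;A version pin alone says "install this release". The hash adds "and only if it is byte-for-byte the file I compiled against".&lt;/p&gt;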
&lt;h2 id="pyprojecttoml-for-configuration"&gt;&lt;a href="https://peps.python.org/pep-0621/"&gt;Pyproject.toml&lt;/a&gt; for configuration&lt;/h2&gt;
&lt;p&gt;Who likes configuration files? I certainly don't. There's nothing nice about having 15 configuration files at the root of my project. One for test coverage, one for linting, one for GitHub, one for formatting, and another one for CI. No thanks.&lt;/p&gt;
&lt;p&gt;Fortunately, &lt;a href="https://peps.python.org/pep-0621/"&gt;PEP 621&lt;/a&gt; happened. And with it, a (mostly) common way to store all of the metadata and configuration for a Python project. Using a &lt;em&gt;single&lt;/em&gt; &lt;code&gt;pyproject.toml&lt;/code&gt;, I can define my local package name and details, my dependencies, my pytest coverage configuration, my formatting configuration, my... You get me. All the configurations, in a &lt;em&gt;single&lt;/em&gt; file. Expand the block below for an example.&lt;/p&gt;
&lt;details&gt;
  &lt;summary&gt;An example &lt;code&gt;pyproject.toml&lt;/code&gt;&lt;/summary&gt;


&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# packaging information&lt;/span&gt;

&lt;span class="k"&gt;[build-system]&lt;/span&gt;
&lt;span class="n"&gt;requires&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;hatchling&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;build-backend&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;hatchling.build&amp;quot;&lt;/span&gt;

&lt;span class="c1"&gt;# project information&lt;/span&gt;

&lt;span class="k"&gt;[project]&lt;/span&gt;
&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;dreambox&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;23.1.26&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;readme&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;README.md&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;requires-python&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;gt;=3.10&amp;quot;&lt;/span&gt;

&lt;span class="c1"&gt;# requirements.txt generated from here&lt;/span&gt;

&lt;span class="n"&gt;dependencies&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Jinja2&amp;gt;=3.1.2&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;loguru&amp;gt;=0.6.0&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;fastapi&amp;gt;=0.88.0&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;uvicorn&amp;gt;=0.20.0&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# requirements-dev.txt generated from here&lt;/span&gt;

&lt;span class="k"&gt;[project.optional-dependencies]&lt;/span&gt;
&lt;span class="n"&gt;dev&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;black&amp;gt;=22.10.0&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;isort&amp;gt;=5.10.1&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;pip-tools&amp;gt;=6.10.0&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;pytest&amp;gt;=7.2.0&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;pytest-cov&amp;gt;=4.0.0&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# linting config&lt;/span&gt;

&lt;span class="k"&gt;[tool.ruff]&lt;/span&gt;
&lt;span class="n"&gt;ignore&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;E501&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# isort config&lt;/span&gt;

&lt;span class="k"&gt;[tool.isort]&lt;/span&gt;
&lt;span class="n"&gt;profile&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;black&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;line_length&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;79&lt;/span&gt;
&lt;span class="n"&gt;skip&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;.env/&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;venv&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;.venv&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# coverage config&lt;/span&gt;

&lt;span class="k"&gt;[tool.coverage.paths]&lt;/span&gt;
&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;src&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;[tool.coverage.run]&lt;/span&gt;
&lt;span class="n"&gt;branch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="n"&gt;relative_files&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;

&lt;span class="k"&gt;[tool.coverage.report]&lt;/span&gt;
&lt;span class="n"&gt;show_missing&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="n"&gt;fail_under&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;

&lt;span class="c1"&gt;# formatting config&lt;/span&gt;

&lt;span class="k"&gt;[tool.black]&lt;/span&gt;
&lt;span class="n"&gt;line-length&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;79&lt;/span&gt;
&lt;span class="n"&gt;extend-exclude&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&amp;#39;&lt;/span&gt;
&lt;span class="s1"&gt;/(&lt;/span&gt;
&lt;span class="s1"&gt;  | .env&lt;/span&gt;
&lt;span class="s1"&gt;  | .venv&lt;/span&gt;
&lt;span class="s1"&gt;  | venv&lt;/span&gt;
&lt;span class="s1"&gt;  | notebooks&lt;/span&gt;
&lt;span class="s1"&gt;)/&lt;/span&gt;
&lt;span class="s1"&gt;&amp;#39;&amp;#39;&amp;#39;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;/details&gt;

&lt;h2 id="makefiles-for-common-sense"&gt;&lt;a href="https://makefiletutorial.com/#top"&gt;Makefiles&lt;/a&gt; for common sense&lt;/h2&gt;
&lt;p&gt;I clone a new repo. Now, where the hell do I start? How do I run the tests? How do I run the API? There should be a common standard for these things right? &lt;em&gt;Wrong&lt;/em&gt;. Every project is different. Every project has its pet peeves. I don't really like pet peeves. Anyone should be able to pick up my project and get up and running straight away. No meetings, no calls, just start working. A well-documented &lt;code&gt;README.MD&lt;/code&gt; might work. But you'll get lazy.&lt;/p&gt;
&lt;p&gt;So what is the easiest way to add a good project "map", without much work? Enter my Makefile. With a Makefile, I define all of the main project commands in a single file. For example:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c"&gt;## Install for production&lt;/span&gt;

&lt;span class="nf"&gt;install&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;python&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;pip&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;--upgrade&lt;span class="w"&gt; &lt;/span&gt;pip
&lt;span class="w"&gt;    &lt;/span&gt;python&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;pip&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;-e&lt;span class="w"&gt; &lt;/span&gt;.

&lt;span class="c"&gt;## Install for development&lt;/span&gt;

&lt;span class="nf"&gt;install-dev&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;install&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;python&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;pip&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;-e&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;.[dev]&amp;quot;&lt;/span&gt;

&lt;span class="c"&gt;## Build dependencies&lt;/span&gt;

&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;pip-compile&lt;span class="w"&gt; &lt;/span&gt;--resolver&lt;span class="o"&gt;=&lt;/span&gt;backtracking&lt;span class="w"&gt; &lt;/span&gt;--output-file&lt;span class="o"&gt;=&lt;/span&gt;requirements.txt&lt;span class="w"&gt; &lt;/span&gt;pyproject.toml
&lt;span class="w"&gt;    &lt;/span&gt;pip-compile&lt;span class="w"&gt; &lt;/span&gt;--resolver&lt;span class="o"&gt;=&lt;/span&gt;backtracking&lt;span class="w"&gt; &lt;/span&gt;--extra&lt;span class="o"&gt;=&lt;/span&gt;dev&lt;span class="w"&gt; &lt;/span&gt;--output-file&lt;span class="o"&gt;=&lt;/span&gt;requirements-dev.txt&lt;span class="w"&gt; &lt;/span&gt;pyproject.toml
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;If I want to install the dependencies on a project, all I do is &lt;code&gt;make install&lt;/code&gt;. Or &lt;code&gt;make install-dev&lt;/code&gt; for the dev dependencies as well. This is nice. But my &lt;em&gt;favorite&lt;/em&gt; feature is this:&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/48/make-help.png" alt="Make help command" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;
With the help of &lt;a href="https://gist.github.com/klmr/575726c7e05d8780505a"&gt;this&lt;/a&gt; small script at the end of my Makefile, I can generate a map for my project. Now, whenever someone new comes in, all they have to do is &lt;code&gt;make help&lt;/code&gt;, and they'll get a good idea of how to navigate things. The script takes the comment above every rule in the Makefile, so it's super easy to add new commands to it.&lt;/p&gt;
&lt;p&gt;Now isn't that nice?&lt;/p&gt;
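&lt;p&gt;To make the idea concrete, here is a minimal Python sketch of what such a help generator does. It is not the linked script, just an illustration that assumes the same &lt;code&gt;## comment&lt;/code&gt; convention; the &lt;code&gt;MAKEFILE&lt;/code&gt; string and the &lt;code&gt;collect_help&lt;/code&gt; and &lt;code&gt;format_help&lt;/code&gt; names are made up for the example.&lt;/p&gt;

```python
import re

# Hypothetical Makefile text that follows the "## comment above the rule" convention.
MAKEFILE = """\
## Install for production

install:
\tpython -m pip install -e .

## Install for development

install-dev: install
\tpython -m pip install -e ".[dev]"
"""

# A rule line starts with a target name followed by ":" (but not ":=").
RULE = re.compile(r"^([A-Za-z0-9_.-]+)\s*:(?!=)")


def collect_help(makefile_text):
    """Pair each rule with the most recent '## ' comment above it."""
    entries = []
    pending = None  # last '## ' comment not yet attached to a rule
    for line in makefile_text.splitlines():
        if line.startswith("## "):
            pending = line[3:].strip()
        elif line.strip() == "":
            continue  # blank lines between the comment and the rule are allowed
        else:
            match = RULE.match(line)
            if match and pending is not None:
                entries.append((match.group(1), pending))
            pending = None
    return entries


def format_help(entries):
    """Render the (target, description) pairs as aligned columns."""
    width = max((len(name) for name, _ in entries), default=0)
    return "\n".join(f"{name.ljust(width)}  {text}" for name, text in entries)


print(format_help(collect_help(MAKEFILE)))
```

&lt;p&gt;Adding a new command to the map is then just a matter of writing one more &lt;code&gt;##&lt;/code&gt; comment above the new rule.&lt;/p&gt;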
&lt;h2 id="ruff-for-linting"&gt;&lt;a href="https://github.com/charliermarsh/ruff"&gt;Ruff&lt;/a&gt; for linting&lt;/h2&gt;
&lt;p&gt;I like Python, but I also like &lt;a href="/blog/on-rust.html"&gt;Rust&lt;/a&gt; a lot. &lt;a href="https://github.com/charliermarsh/ruff"&gt;Ruff&lt;/a&gt; is a new (and shiny) Python linter written in Rust. It's probably the most modern project in my boilerplate. It does what any linter should do: flag errors and bad practices. I like catching errors before they blow up in the user's face.&lt;/p&gt;
&lt;p&gt;Most importantly, Ruff is fast. &lt;a href="https://www.reddit.com/r/ProgrammingLanguages/comments/v69shk/what_makes_languages_blazingly_fast/"&gt;&lt;em&gt;Blazingly&lt;/em&gt;&lt;/a&gt; fast, one might say? Running &lt;code&gt;ruff&lt;/code&gt; in our project will probably take less than half a second. Also, like all other tools I mentioned here, Ruff supports &lt;code&gt;pyproject.toml&lt;/code&gt; for configuration. So you don't have to maintain &lt;em&gt;yet&lt;/em&gt; another file. (I'm looking at you, &lt;a href="https://github.com/PyCQA/flake8/issues/234"&gt;flake8&lt;/a&gt;.)&lt;/p&gt;
&lt;h2 id="black-isort-for-formatting"&gt;&lt;a href="https://black.readthedocs.io/en/stable/"&gt;Black&lt;/a&gt; &amp;amp; &lt;a href="https://pycqa.github.io/isort/"&gt;isort&lt;/a&gt; for formatting&lt;/h2&gt;
&lt;p&gt;I never got the whole tabs vs. spaces thing. But I do know programmers have opinions. So many opinions. A cool thing about working in software is that you can use software to &lt;em&gt;squash&lt;/em&gt; those opinions. Once we agree on a couple of tools to format our code, all other discussions become useless. So thanks, &lt;a href="https://lukasz.langa.pl/"&gt;Łukasz&lt;/a&gt;, for creating &lt;a href="https://github.com/psf/black"&gt;Black&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Most of my projects have a great &lt;code&gt;make format&lt;/code&gt; rule, that runs &lt;code&gt;black&lt;/code&gt; and &lt;code&gt;isort&lt;/code&gt;. The configuration? It's in my &lt;code&gt;pyproject.toml&lt;/code&gt; file above.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c"&gt;## Format files using black&lt;/span&gt;

&lt;span class="nf"&gt;format&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;isort&lt;span class="w"&gt; &lt;/span&gt;.
&lt;span class="w"&gt;    &lt;/span&gt;black&lt;span class="w"&gt; &lt;/span&gt;.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Note: Apparently ruff &lt;a href="https://twitter.com/charliermarsh/status/1597661264800813056"&gt;supports&lt;/a&gt; import sorting now? Wow. My boilerplate is already obsolete.&lt;/em&gt;&lt;/p&gt;
&lt;h2 id="pre-commit-hooks-i-dont"&gt;pre-commit hooks (I don't)&lt;/h2&gt;
&lt;p&gt;I've seen a lot of projects adopt &lt;a href="https://pre-commit.com/"&gt;pre-commit&lt;/a&gt; hooks as a way of enforcing standards for code. Whether it's to make sure tests pass or to ensure nothing gets raised by linters, pre-commit hooks are all the rage. The thing is, git is already &lt;em&gt;pretty&lt;/em&gt; complicated. I mostly work with data scientists, and most of them already face a steep curve in adopting software practices. From my (opinionated) experience, pre-commit hooks make things even more complicated.&lt;/p&gt;
&lt;p&gt;But yes, some standards should be enforced, otherwise, something will break for sure. But isn't that what CI is for? Enforcing something like formatting and linting is easy with the &lt;code&gt;--check&lt;/code&gt; flag:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c"&gt;## Run checks (ruff + isort + black)&lt;/span&gt;

&lt;span class="nf"&gt;check&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;ruff&lt;span class="w"&gt; &lt;/span&gt;.
&lt;span class="w"&gt;    &lt;/span&gt;isort&lt;span class="w"&gt; &lt;/span&gt;--check&lt;span class="w"&gt; &lt;/span&gt;.
&lt;span class="w"&gt;    &lt;/span&gt;black&lt;span class="w"&gt; &lt;/span&gt;--check&lt;span class="w"&gt; &lt;/span&gt;.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;If I run &lt;code&gt;make check&lt;/code&gt; and something doesn't comply, I'll get an error. When opening a PR with a new feature, CI can take care of all of those checks:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nt"&gt;on&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;pull_request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="nt"&gt;jobs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;test&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;runs-on&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;ubuntu-latest&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;uses&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;actions/checkout@v2&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;Install dev requirements&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;run&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;make install-dev&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;Check formatting&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;# &amp;lt;- if this does not pass&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;run&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;make check&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;Run tests&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;# &amp;lt;- you don&amp;#39;t get here&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;run&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;make test&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;If something doesn't run, CI doesn't pass. No need to mess around with git on everyone's machine. Everyone is free to commit whatever they want, and will get notified if something is not working. That's good enough.&lt;/p&gt;
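&lt;p&gt;The mechanism behind all of this is just exit codes: with &lt;code&gt;--check&lt;/code&gt;, the formatters report problems instead of rewriting files and exit with a non-zero status, and both make and CI stop at the first command that does so. Here is a small Python sketch of that behaviour. The &lt;code&gt;run_checks&lt;/code&gt; helper and the &lt;code&gt;passing&lt;/code&gt; and &lt;code&gt;failing&lt;/code&gt; stand-in commands are made up for the example, so that it runs without any of the tools installed.&lt;/p&gt;

```python
import subprocess
import sys


def run_checks(commands):
    """Run each command in order; return the first non-zero exit code, else 0."""
    for command in commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            # Stop at the first failure, like make (and a CI step) would.
            return result.returncode
    return 0


# Stand-ins for commands such as `black --check .`: each one simply exits
# with 0 (everything complies) or 1 (something needs fixing).
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]

print(run_checks([passing, passing]))
print(run_checks([passing, failing, passing]))
```

&lt;p&gt;In the second call the third command never runs, which is the same reason the tests step is skipped when the formatting check fails.&lt;/p&gt;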
&lt;h2 id="an-example-project"&gt;An example project&lt;/h2&gt;
&lt;p&gt;At some of my previous &lt;a href="/talks"&gt;talks&lt;/a&gt;, people normally ask me to share my project template (e.g., my &lt;a href="https://cookiecutter.readthedocs.io/en/stable/"&gt;cookiecutter&lt;/a&gt;). For me, sharing project templates is a bit like copying someone else's Vim config. As soon as you need to tweak something (and trust me, you will), you'll have a headache.&lt;/p&gt;
&lt;p&gt;This boilerplate is made of choices (or traumas) I've experienced over time. It's tailored exactly to &lt;em&gt;my&lt;/em&gt; needs (e.g., data science, machine learning, API design). I like &lt;a href="/blog/simple-software.html"&gt;simple&lt;/a&gt; things a lot.&lt;/p&gt;
&lt;p&gt;I do think it could serve some as a starting point for their own &lt;em&gt;opinionated&lt;/em&gt; boilerplate. So &lt;a href="https://github.com/duarteocarmo/boilerplate"&gt;here's&lt;/a&gt; the code, fellow Pythonista. Have fun out there.&lt;/p&gt;
&lt;hr&gt;

&lt;h4 id="updates-notes"&gt;Updates &amp;amp; notes&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;This post was featured in &lt;a href="https://www.youtube.com/live/Mh760W_M2ro?feature=share&amp;amp;t=1040"&gt;Episode #326&lt;/a&gt; of the PythonBytes podcast&lt;/li&gt;
&lt;/ul&gt;</content><category term="blog"/></entry><entry><title>On Rust</title><link href="https://duarteocarmo.com/blog/on-rust.html" rel="alternate"/><published>2023-01-02T00:00:00+01:00</published><updated>2023-01-02T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2023-01-02:/blog/on-rust.html</id><summary type="html">&lt;p&gt;It's that time of the year again. The family is getting together and celebrating. Grandma is cooking something amazing for dinner. I'm trying to solve &lt;a href="https://adventofcode.com/"&gt;Advent of Code&lt;/a&gt; puzzles. This year, I decided to do something &lt;em&gt;different&lt;/em&gt;. Instead of solving the puzzles in Python, I decided that I would try …&lt;/p&gt;</summary><content type="html">&lt;p&gt;It's that time of the year again. The family is getting together and celebrating. Grandma is cooking something amazing for dinner. I'm trying to solve &lt;a href="https://adventofcode.com/"&gt;Advent of Code&lt;/a&gt; puzzles. This year, I decided to do something &lt;em&gt;different&lt;/em&gt;. Instead of solving the puzzles in Python, I decided that I would try to solve them in &lt;a href="https://www.rust-lang.org/"&gt;Rust&lt;/a&gt;. Why?&lt;/p&gt;
&lt;p&gt;Rust has been getting a lot of attention lately. The language is in its &lt;em&gt;7th&lt;/em&gt; year as the &lt;a href="https://survey.stackoverflow.co/2022/#technology-most-loved-dreaded-and-wanted"&gt;most loved programming language&lt;/a&gt; according to the latest Stack Overflow developer survey. Why do people love it so much? So I decided to learn it. What better way to get frustrated during my Christmas break?&lt;/p&gt;
&lt;h2 id="things-i-like"&gt;Things I like&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Speed.&lt;/strong&gt; That's the first thing I noticed. I brute force &lt;em&gt;a lot&lt;/em&gt; of these AoC puzzles. Rust handled them without hiccups. It's a bit like the feeling of using Numpy if you're a Python dev. Having been one for some years, I never got much exposure to compiled languages like C++ or Java. I can't help but notice a pretty big difference in speed compared to Python or JavaScript.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Effectiveness.&lt;/strong&gt; Most puzzles I solved were right on the first try. This is rarely the case with Python. Rather than using types as more of a &lt;em&gt;decorative/ergonomic&lt;/em&gt; feature like in Python, in Rust, these types are strictly enforced at compile time. This made me think twice about the code I wrote. But it also led to more correct code. &lt;a href="https://rust-analyzer.github.io/"&gt;Rust-analyzer&lt;/a&gt;, Rust's LSP, was also a great experience, giving timely, useful, and clear messages about what was wrong.&lt;/p&gt;
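&lt;p&gt;As a small illustration of what "decorative" means on the Python side, here is a sketch with a made-up &lt;code&gt;double&lt;/code&gt; function. The annotations say &lt;code&gt;int&lt;/code&gt;, but nothing checks them at runtime, so the wrong type slips through unless you run a separate static checker such as mypy. The equivalent mistake in Rust would be rejected by the compiler.&lt;/p&gt;

```python
def double(value: int) -> int:
    """Annotated as int -> int, but Python does not check this at runtime."""
    return value * 2


# The annotation says int, yet a str is accepted and quietly "works".
print(double(21))
print(double("ab"))
```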
&lt;p&gt;&lt;strong&gt;Syntax.&lt;/strong&gt; The syntax is not that different from Python. For loops are easy and readable with the &lt;code&gt;for x in y&lt;/code&gt; syntax, for example. &lt;a href="https://doc.rust-lang.org/beta/rust-by-example/fn/closures.html"&gt;Closures&lt;/a&gt; remind me a lot of lambda functions in Python. I can also see why Python adopted the &lt;a href="https://doc.rust-lang.org/rust-by-example/flow_control/match.html"&gt;match&lt;/a&gt; statement. In general, Rust was readable. Whenever code got complicated and repetitive, things like &lt;a href="https://doc.rust-lang.org/rust-by-example/macros.html"&gt;macros&lt;/a&gt; came to the rescue.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ecosystem.&lt;/strong&gt; Compared to Pip, &lt;a href="https://doc.rust-lang.org/cargo/guide/index.html"&gt;Cargo&lt;/a&gt; is a breath of fresh air. Installing packages is as easy as copy-pasting into your &lt;code&gt;Cargo.toml&lt;/code&gt; file. In my experience, even larger packages such as &lt;a href="https://docs.rs/nalgebra/latest/nalgebra/"&gt;nalgebra&lt;/a&gt; have been fast to install. To check the documentation of all packages installed locally, you can also use &lt;code&gt;cargo doc --open&lt;/code&gt;, which is great for offline development. There are also good resources online when you're feeling stuck, like the &lt;a href="https://doc.rust-lang.org/book/"&gt;Rust Programming Language Book&lt;/a&gt;, or the more practical &lt;a href="https://doc.rust-lang.org/stable/rust-by-example/"&gt;Rust by Example&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="things-i-dont-like-yet"&gt;Things I don't like (yet)&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Efficiency.&lt;/strong&gt; Rust takes longer to write. Probably, due to my lack of experience. I get the impression programs need to be better thought out. You'll take longer to write code, but it will be more &lt;em&gt;effective&lt;/em&gt; code. This makes Rust great for production. However, it's hard to think of Rust as a good language to experiment, explore, or even iterate quickly. Experimentation is a great advantage of languages like Python or JavaScript. It's a trade-off.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Types.&lt;/strong&gt; I like types. I've been using types in most of my Python production applications. Not only that, but I believe I now understand better why Python doesn't &lt;em&gt;enforce&lt;/em&gt; types. Types help you write more &lt;em&gt;correct&lt;/em&gt; programs, but it's easy to get stuck in type hell. I don't know if I want a &lt;code&gt;usize&lt;/code&gt; or an &lt;code&gt;i32&lt;/code&gt;. I don't care if it's a &lt;code&gt;Matrix&lt;/code&gt; or a &lt;code&gt;DMatrix&lt;/code&gt;, or &lt;code&gt;f32&lt;/code&gt; or &lt;code&gt;f64&lt;/code&gt;. The Rust compiler does. When I'm focused on solving a problem, I'm not interested in solving a &lt;em&gt;type&lt;/em&gt; problem. Types are great. But if you get too strict about them, types can get in the way. I felt Rust types got in the way a lot of times.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ecosystem.&lt;/strong&gt; Rust's ecosystem is still young. Yes, AoC puzzles are very specific and don't represent the real world. But when looking for libraries, for example, it appears Rust is still trying to figure out where it stands. This is normal. There are not that many resources on the internet &lt;em&gt;yet&lt;/em&gt;. For a beginner programmer in any language, the quality of web resources is everything. When looking for answers and resources on Rust, it takes longer to find the right ones. When looking for libraries, it's tough to understand which libraries are the most popular ones, or the ones to go with. Even though things are developing &lt;a href="https://lib.rs/"&gt;quickly&lt;/a&gt;.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;Vector3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;: &lt;span class="kt"&gt;f32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;: &lt;span class="kt"&gt;f32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;: &lt;span class="kt"&gt;f32&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;dot_product&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;: &lt;span class="nc"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;: &lt;span class="nc"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-&amp;gt; &lt;span class="kt"&gt;f32&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;do_math_by_copy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p1&lt;/span&gt;: &lt;span class="nc"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;p2&lt;/span&gt;: &lt;span class="nc"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;d1&lt;/span&gt;: &lt;span class="nc"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;d2&lt;/span&gt;: &lt;span class="nc"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;: &lt;span class="kt"&gt;f32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;: &lt;span class="kt"&gt;f32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-&amp;gt; &lt;span class="kt"&gt;f32&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;p1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;d1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;p2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;d2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;dot_product&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;do_math_by_borrow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p1&lt;/span&gt;: &lt;span class="kp"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nc"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;p2&lt;/span&gt;: &lt;span class="kp"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nc"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;d1&lt;/span&gt;: &lt;span class="kp"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nc"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;d2&lt;/span&gt;: &lt;span class="kp"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nc"&gt;Vector3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;: &lt;span class="kt"&gt;f32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;: &lt;span class="kt"&gt;f32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-&amp;gt; &lt;span class="kt"&gt;f32&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;p1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;d1&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;p2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;d2&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;dot_product&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Borrowing.&lt;/strong&gt; References and borrowing have been the hardest concepts for me to grasp. Again, this might very well be due to my lack of experience with compiled languages. Look at the example above, taken from &lt;a href="https://www.forrestthewoods.com/blog/should-small-rust-structs-be-passed-by-copy-or-by-borrow/"&gt;this&lt;/a&gt; great post. There is no obvious reason why you would go with &lt;code&gt;do_math_by_borrow&lt;/code&gt;. It's uglier, less readable, and to the best of my knowledge, &lt;em&gt;not&lt;/em&gt; significantly faster. This whole borrowing and ownership dance is probably necessary for Rust. But I felt like sprinkling &lt;code&gt;.clone()&lt;/code&gt; calls and &lt;code&gt;Copy&lt;/code&gt; derives everywhere to escape this problem.&lt;/p&gt;
&lt;h2 id="closing-thoughts"&gt;Closing thoughts&lt;/h2&gt;
&lt;p&gt;AoC was great to learn a new language! Perhaps next year something easier though. I didn't finish all the puzzles &lt;a href="https://github.com/duarteocarmo/advent2022/"&gt;by any means&lt;/a&gt;. But by day 10, I had a good grasp of the basics of Rust and could write simple programs without too much hassle. The problem with AoC is the steep curve in puzzle difficulty. By day 15, I was already spending one hour just to parse the input correctly.&lt;/p&gt;
&lt;p&gt;I can understand why Rust is loved. It's fast, effective, and it makes you a better programmer. It requires you to make deliberate choices on every variable you create, and every statement you write. This is a double-edged sword. On one side, it makes you write better and safer code. On the other side, it makes you write code slower, and experiments should be fast. It does look great for production. But it's &lt;a href="https://mdwdotla.medium.com/using-rust-at-a-startup-a-cautionary-tale-42ab823d9454"&gt;early&lt;/a&gt; still.&lt;/p&gt;
&lt;p&gt;I would love to try and write some Machine Learning APIs in Rust. Written in Rust, they might be safer and faster! But for that to happen, I would need to be able to load models using Rust. This means loading Scikit, Transformer, or even PyTorch models with Rust. I don't think we are there yet. But when we are, I would love to give it a go.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Scaling Machine Learning microservices</title><link href="https://semaphore.io/blog/machine-learning-microservice" rel="alternate"/><published>2022-12-18T00:00:00+01:00</published><updated>2022-12-18T00:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:semaphore.io,2022-12-18:/blog/machine-learning-microservice</id><content type="html"/><category term="blog"/></entry><entry><title>Monitoring Machine Learning APIs</title><link href="https://duarteocarmo.com/blog/monitoring-machine-learning-apis.html" rel="alternate"/><published>2022-12-06T10:30:00+01:00</published><updated>2022-12-06T10:30:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2022-12-06:/blog/monitoring-machine-learning-apis.html</id><summary type="html">&lt;p&gt;Last Friday I &lt;a href="/talks"&gt;presented&lt;/a&gt; at PyData Global. I talked about  &lt;em&gt;monitoring&lt;/em&gt;, but felt like there was more I should've said. So this article is just about that.&lt;/p&gt;
&lt;p&gt;Even though the number of models in the wild is growing, the field of monitoring is still green and full of unknowns.&lt;/p&gt;
&lt;p&gt;Almost …&lt;/p&gt;</summary><content type="html">&lt;p&gt;Last Friday I &lt;a href="/talks"&gt;presented&lt;/a&gt; at PyData Global. I talked about  &lt;em&gt;monitoring&lt;/em&gt;, but felt like there was more I should've said. So this article is just about that.&lt;/p&gt;
&lt;p&gt;Even though the number of models in the wild is growing, the field of monitoring is still green and full of unknowns.&lt;/p&gt;
&lt;p&gt;Almost all models I've put into customers' hands have had some sort of monitoring. In my head, I've divided ML monitoring into three main areas.&lt;/p&gt;
&lt;p&gt;Let's explore them.&lt;/p&gt;
&lt;h2 id="logs-error-monitoring"&gt;Logs &amp;amp; Error monitoring&lt;/h2&gt;
&lt;p&gt;This first area is responsible for monitoring three things:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;em&gt;Errors&lt;/em&gt;: Something in the application blew up, and needs to be checked (e.g., tracebacks, etc.)&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Logs&lt;/em&gt;: What's going on? When is it going on?&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Performance&lt;/em&gt;: How fast is the API? How much load can it take?&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This is very similar to traditional software observability. And yes, most machine learning applications are, in fact, &lt;em&gt;software&lt;/em&gt; applications. This means we have all the concerns of traditional software, and &lt;em&gt;then some&lt;/em&gt;.&lt;/p&gt;
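&lt;p&gt;As a rough sketch of those three concerns in one place, here is a small decorator built only on Python's standard library. The &lt;code&gt;monitored&lt;/code&gt; decorator, the &lt;code&gt;ml_api&lt;/code&gt; logger name, and the &lt;code&gt;predict&lt;/code&gt; function are all made up for the example; a real setup would likely hand this work to dedicated tooling.&lt;/p&gt;

```python
import functools
import logging
import time

# Include timestamps so the logs answer "when is it going on?".
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
logger = logging.getLogger("ml_api")  # hypothetical logger name


def monitored(func):
    """Log calls, latency, and errors for the wrapped function."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
        except Exception:
            # Errors: record the full traceback, then re-raise.
            logger.exception("%s failed", func.__name__)
            raise
        finally:
            # Performance: how long did the call take?
            elapsed_ms = (time.perf_counter() - start) * 1000
            logger.info("%s took %.2f ms", func.__name__, elapsed_ms)
        # Logs: what happened (the timestamp comes from the log format).
        logger.info("%s succeeded", func.__name__)
        return result

    return wrapper


@monitored
def predict(features):
    # Hypothetical stand-in for a real model call.
    return sum(features) / len(features)


print(predict([1.0, 2.0, 3.0]))
```

&lt;p&gt;Calling &lt;code&gt;predict([])&lt;/code&gt; would log the traceback and the latency, and then raise the &lt;code&gt;ZeroDivisionError&lt;/code&gt; as usual.&lt;/p&gt;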
&lt;p&gt;When it comes to logging, &lt;a href="https://github.com/Delgan/loguru"&gt;loguru&lt;/a&gt; has made life easier.  At the end of the day, who likes to configure loggers?&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/45/monitoring-sketch-1.png" alt="Logs/Error monitoring of ML Apps" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;Another lesser-known component is &lt;a href="https://opentelemetry.io/"&gt;OpenTelemetry&lt;/a&gt;. OpenTelemetry has slowly become the standard in observability. Think of it as a wrapper that allows you to capture useful statistics on the performance of your app. It also allows you to capture some custom events/data that are useful for debugging.&lt;/p&gt;
&lt;p&gt;Here's an example:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;fastapi&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;opentelemetry&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;opentelemetry.exporter.otlp.proto.http.trace_exporter&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;OTLPSpanExporter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;opentelemetry.instrumentation.fastapi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastAPIInstrumentor&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;opentelemetry.sdk.trace&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TracerProvider&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;opentelemetry.sdk.trace.export&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BatchSpanProcessor&lt;/span&gt;

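&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;src.pipeline&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_prediction_for&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;.models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Item&lt;/span&gt;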

&lt;span class="c1"&gt;# set up tracing and open telemetry&lt;/span&gt;

&lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;TracerProvider&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;processor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BatchSpanProcessor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;OTLPSpanExporter&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_span_processor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_tracer_provider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tracer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_tracer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="vm"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# instrument FastAPI&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fastapi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;demo&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;FastAPIInstrumentor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;instrument_app&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;/predict/&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Item&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# get the current span&lt;/span&gt;
    &lt;span class="n"&gt;current_span&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_current_span&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# hash input&lt;/span&gt;
    &lt;span class="n"&gt;input_hash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;hash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

    &lt;span class="c1"&gt;# save hash to opentelemetry&lt;/span&gt;
    &lt;span class="n"&gt;current_span&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_attribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;app.demo.input_hash&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input_hash&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# &amp;lt;- Saves attribute&lt;/span&gt;

    &lt;span class="c1"&gt;# return predictions&lt;/span&gt;
    &lt;span class="n"&gt;prediction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_prediction_for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;prediction&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Let's walk through the example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;We kick off manual instrumentation using the &lt;code&gt;TracerProvider&lt;/code&gt; (&lt;a href="https://opentelemetry.io/docs/instrumentation/python/manual/"&gt;docs&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;We add automatic instrumentation with the &lt;code&gt;FastAPIInstrumentor&lt;/code&gt; (&lt;a href="https://opentelemetry-python-contrib.readthedocs.io/en/latest/instrumentation/fastapi/fastapi.html"&gt;docs&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;We attach custom data to the current span with the &lt;code&gt;current_span.set_attribute&lt;/code&gt; method (&lt;a href="https://opentelemetry.io/docs/instrumentation/python/manual/#add-attributes-to-a-span"&gt;docs&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
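&lt;p&gt;One caveat about the input hash in the example: Python's built-in &lt;code&gt;hash&lt;/code&gt; isn't meant to be stable. For strings it is salted per process, and many objects aren't hashable at all, so the same request can end up with different values across workers or restarts. Here's a minimal sketch of a stable alternative, assuming the request body can be turned into a plain dict. The &lt;code&gt;stable_input_hash&lt;/code&gt; helper and the example payload are hypothetical, not part of FastAPI or OpenTelemetry:&lt;/p&gt;

```python
import hashlib
import json


def stable_input_hash(payload: dict):
    """Return a SHA-256 hex digest that is stable across processes."""
    # sort_keys makes the serialization independent of key order
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical payload standing in for `features.dict()`
example = {"age": 42, "country": "PT"}
same_example = {"country": "PT", "age": 42}

# Same content, different key order: the digests match
print(stable_input_hash(example) == stable_input_hash(same_example))
```

&lt;p&gt;The resulting digest is a plain string, so it could be passed to &lt;code&gt;current_span.set_attribute&lt;/code&gt; in place of the built-in hash.&lt;/p&gt;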
&lt;p&gt;If you're using something like &lt;a href="https://www.datadoghq.com/"&gt;Datadog&lt;/a&gt;, &lt;a href="https://www.honeycomb.io/"&gt;Honeycomb&lt;/a&gt;, or any other provider that supports OpenTelemetry, you should get all this information on your service's dashboard. You'll get all the stats related to your application, and all the custom data you are sending in as well.&lt;/p&gt;
&lt;p&gt;This is particularly useful for understanding, for example, when your API is breaking, and &lt;em&gt;what&lt;/em&gt; is making it break!&lt;/p&gt;
&lt;h2 id="driftdegradation-monitoring"&gt;Drift/Degradation monitoring&lt;/h2&gt;
&lt;p&gt;The second area of monitoring is about model degradation in the face of new data. This usually happens as models in production become outdated. Trying to make predictions on types of data your model has never seen before usually leads to problems. This concept is known as &lt;em&gt;data drift&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/45/monitoring-sketch-2.png" alt="Drift monitoring" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;In a &lt;a href="/blog/monitoring-ml-models-fastapi-evidently.html"&gt;previous&lt;/a&gt; article, I talked extensively about this topic. I explained how to leverage FastAPI and Evidently to monitor drift, in real time, for your ML application. &lt;a href="/blog/monitoring-ml-models-fastapi-evidently.html"&gt;So go read that if you're interested&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;There is, however, one concept I should double down on: storing all inputs (e.g., requests) and outputs (e.g., predictions) of your model. Here's an example of doing so without increasing response latency:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# ...&lt;/span&gt;

&lt;span class="nd"&gt;@app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;/predict/&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Item&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;background_tasks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;fastapi&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BackgroundTasks&lt;/span&gt;&lt;span class="p"&gt;,):&lt;/span&gt;
    &lt;span class="n"&gt;prediction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_prediction_for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# saves both inputs and outputs as json to our predictions database&lt;/span&gt;
    &lt;span class="c1"&gt;# runs in the background - doesn&amp;#39;t make anyone wait&lt;/span&gt;
    &lt;span class="n"&gt;background_tasks&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;save_to_database&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;asdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prediction&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;prediction&lt;/span&gt;

&lt;span class="c1"&gt;# ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
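&lt;p&gt;The &lt;code&gt;save_to_database&lt;/code&gt; function itself is left out above. Here's a minimal sketch of what it could look like, using SQLite from the standard library purely for illustration. The table name, the columns, and the extra &lt;code&gt;conn&lt;/code&gt; argument are assumptions for this example; in production you'd more likely point it at a managed database.&lt;/p&gt;

```python
import datetime
import json
import sqlite3


def save_to_database(conn, input, output, created_at):
    """Store one prediction as JSON text (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS predictions "
        "(created_at TEXT, input TEXT, output TEXT)"
    )
    conn.execute(
        "INSERT INTO predictions VALUES (?, ?, ?)",
        (created_at.isoformat(), json.dumps(input), json.dumps(output)),
    )
    conn.commit()


# In-memory database and made-up values, just to exercise the function
conn = sqlite3.connect(":memory:")
save_to_database(
    conn,
    input={"age": 42},
    output={"label": "churn", "score": 0.91},
    created_at=datetime.datetime(2022, 12, 6, 10, 30),
)
rows = conn.execute("SELECT created_at, output FROM predictions").fetchall()
print(rows)
```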

&lt;p&gt;This &lt;em&gt;prediction sink&lt;/em&gt; (red in the image above) is key. As described in the previous article, it allows you to monitor drift. But it &lt;em&gt;also&lt;/em&gt; allows you to get insights into the third area of monitoring. Let's talk about that one.&lt;/p&gt;
&lt;h2 id="business-monitoringreporting"&gt;Business monitoring/reporting&lt;/h2&gt;
&lt;p&gt;The third and final area of monitoring relates to &lt;em&gt;everything else&lt;/em&gt; you might want to keep track of. Your team might want to know how many predictions the API has made in the past month, or how many times you've predicted a certain class. Maybe you, or some other team, are tracking KPIs that are highly related to your model. Maybe the finance dept. wants to know the new CTR based on your recommendations.&lt;/p&gt;
&lt;p&gt;The answers to these questions often come in the form of a dashboard.
&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/45/monitoring-sketch-complete.png" alt="Business monitoring" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;Fortunately, if we store our inputs and predictions in a flexible enough format (e.g., &lt;code&gt;json&lt;/code&gt;), we'll be able to take almost any off-the-shelf tool and make a dashboard. Designing a &lt;a href="https://www.metabase.com/"&gt;Metabase&lt;/a&gt; dashboard from your historical &lt;code&gt;json&lt;/code&gt; predictions stored in BigQuery should be straightforward. You could also stick to Google Data Studio. Or use Excel. You get my point: you can use whatever. &lt;code&gt;json&lt;/code&gt; is flexible enough to be sorted by &lt;code&gt;created_at&lt;/code&gt; and pushed into a dashboard.&lt;/p&gt;
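&lt;p&gt;As a rough illustration of how little is needed, here's a sketch that turns stored &lt;code&gt;json&lt;/code&gt; predictions into monthly counts per predicted class, using only the standard library. The record layout (&lt;code&gt;created_at&lt;/code&gt; plus an &lt;code&gt;output.label&lt;/code&gt; field) and the sample rows are assumptions made up for this example.&lt;/p&gt;

```python
import json
from collections import Counter

# Hypothetical rows, as they might come out of the predictions store
stored_rows = [
    '{"created_at": "2022-11-03T09:15:00", "output": {"label": "spam"}}',
    '{"created_at": "2022-11-20T18:02:00", "output": {"label": "ham"}}',
    '{"created_at": "2022-12-01T07:45:00", "output": {"label": "spam"}}',
    '{"created_at": "2022-11-28T12:00:00", "output": {"label": "spam"}}',
]


def monthly_class_counts(rows):
    """Count predictions per (month, label), visiting rows in time order."""
    records = sorted((json.loads(r) for r in rows), key=lambda r: r["created_at"])
    counts = Counter()
    for record in records:
        month = record["created_at"][:7]  # "YYYY-MM" prefix of the ISO timestamp
        counts[(month, record["output"]["label"])] += 1
    return dict(counts)


counts = monthly_class_counts(stored_rows)
for (month, label), n in counts.items():
    print(month, label, n)
```

&lt;p&gt;A table like this is already enough to feed a bar chart in pretty much any dashboarding tool.&lt;/p&gt;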
&lt;p&gt;Although often &lt;em&gt;overlooked&lt;/em&gt;, this third level of reporting will often debunk the whole "your model is a black box" paradigm. It allows other stakeholders to know exactly what is going on in your model.&lt;/p&gt;
&lt;h2 id="closing-thoughts"&gt;Closing thoughts&lt;/h2&gt;
&lt;p&gt;Yes. Adding all of these different types of monitoring requires &lt;em&gt;some&lt;/em&gt; boilerplate code. And of course, setting this up might take some time. &lt;em&gt;Especially&lt;/em&gt; if you do this for every model in production.&lt;/p&gt;
&lt;p&gt;However, it's certainly impressive just how much insight you can get from some OpenTelemetry auto-instrumentation and a background job to save all your predictions.&lt;/p&gt;
&lt;p&gt;Sufficiently monitoring your ML model increases your team's confidence when putting models out there. It's a bit like testing - you are now more sure things are happening as they should. With the third area of monitoring, you can then extend that feeling to the rest of the organization.&lt;/p&gt;
&lt;p&gt;"Machine learning is such a black box". Open some doors to the box, and let other people in.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Goodbye Apple Watch</title><link href="https://duarteocarmo.com/blog/coros-pace-2-apple-watch.html" rel="alternate"/><published>2022-10-29T12:00:00+02:00</published><updated>2022-10-29T12:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2022-10-29:/blog/coros-pace-2-apple-watch.html</id><summary type="html">&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/44/cover_small_optim.png" alt="Coros pace 2 outside shot" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;1768 days.&lt;/p&gt;
&lt;p&gt;That's how long I've used the Apple Watch. Just a little under 5 years, in case you're wondering. I really love it. I don't know about "future of computing" and all that, but it's definitely a device I've fallen in love with. Last week, for the first time …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/44/cover_small_optim.png" alt="Coros pace 2 outside shot" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;1768 days.&lt;/p&gt;
&lt;p&gt;That's how long I've used the Apple Watch. Just a little under 5 years, in case you're wondering. I really love it. I don't know about "future of computing" and all that, but it's definitely a device I've fallen in love with. Last week, for the first time, I ditched it. Let me explain why.&lt;/p&gt;
&lt;p&gt;My last marathon didn't go that well. A story for another day. In the aftermath, I've been thinking about my training process. What went well, what went wrong, you know, introspection and all that. So I did what I do best. I got too deep into the topic. I researched how other people train. I explored the vast sea of resources about running. You come across some &lt;a href="https://fellrnr.com/wiki/Main_Page"&gt;pretty cool&lt;/a&gt; resources. And some &lt;a href="http://www.electricblues.com/html/runpro.html"&gt;scary&lt;/a&gt; ones.&lt;/p&gt;
&lt;p&gt;I was watching Kipchoge, and how he just &lt;a href="https://worldathletics.org/news/report/eliud-kipchoge-world-record-berlin-marathon-2022"&gt;broke the WR in Berlin&lt;/a&gt;. What a guy. You know what watch he uses for running? It's called &lt;a href="https://www.coros.com/pace2"&gt;Coros Pace 2&lt;/a&gt;. I'm far from a Kipchoge, but for 200 EUR, I had to know more. After some weeks of &lt;em&gt;e-window&lt;/em&gt; shopping and a lot of YouTube reviews, I ordered it.&lt;/p&gt;
&lt;h2 id="the-good"&gt;The good&lt;/h2&gt;
&lt;p&gt;20 days. That's how long this watch can last on a single charge. Yes, you read that right. &lt;em&gt;20 days&lt;/em&gt;. For someone who is used to charging an Apple Watch every single day, that's a pretty nice upgrade. I don't have to worry about charging it every night, I don't have to carry an extra cable every single time I travel. I just go, and it will last the whole trip.&lt;/p&gt;
&lt;p&gt;Obviously, I bought this watch for the fitness capabilities. Compared to the Apple Watch, it delivers. Setting up the Apple Watch for interval training, or any type of specific workout, can be a pain. It doesn't even come &lt;em&gt;close&lt;/em&gt; to the Coros. The workout views are much more informative and the after-workout analyses are much more insightful. That's expected. What I didn't expect is a full web-based training portal where I can track all my workout metrics. Now that's a nice surprise.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/44/traininghub.png" alt="Coros training hub" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;p&gt;The training hub and the Coros app (which is not &lt;em&gt;that&lt;/em&gt; bad) give me much better control of my training plan. I can plan 3 months in advance and design all my workouts exactly like I want. For every workout, Coros gives me the Aerobic and Anaerobic training effect, my effort, my performance, so I know exactly what went down.&lt;/p&gt;
&lt;p&gt;Sleep tracking. You know that thing you &lt;em&gt;sometimes&lt;/em&gt; do with your Apple Watch, but not &lt;em&gt;too&lt;/em&gt; often since you don't know if you'll have time to charge in the morning? Yup. Since I don't have to charge the Coros overnight, I get access to all my Sleep stats. I know exactly how much and how well I've slept over time. You know, 'cause sleep &lt;a href="https://www.nature.com/articles/4371207a"&gt;matters&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="the-bad"&gt;The bad&lt;/h2&gt;
&lt;p&gt;It's not an Apple Watch. Yes, I could write about the fact that some parts of the Coros app are kinda rough, or the fact that the watch interface is a bit clunky. Or even the fact that it uses some sort of weird proprietary charger. But the bottom line is, I keep comparing it to the Apple Watch.&lt;/p&gt;
&lt;p&gt;One side of me now understands why Apple has never given us custom watch faces. Coros has &lt;em&gt;at least&lt;/em&gt; 100 watch faces to choose from, and they all look… well, not great. This watch is clearly missing the aesthetic appeal that the Apple Watch has.&lt;/p&gt;
&lt;p&gt;I have a love/hate relationship with Apple's beautiful walled garden of an ecosystem. How can I miss paying with my wrist so much? How can I miss controlling music playback on my wrist so much? Seeing those circles close in every widget I own. Getting all the notifications on your watch. The next event in your calendar at a glance of the wrist. The "Do not disturb until I leave this location", two taps away. Even the haptic motor.&lt;/p&gt;
&lt;p&gt;I miss the cozy ecosystem. It's a bit like a withdrawal - it will fade. I think.&lt;/p&gt;
&lt;p&gt;&lt;center&gt;
&lt;img src="https://duarteocarmo.com/images/44/home.png" alt="Coros pace 2 outside shot at home" style="max-width:100%;border-radius: 2px"&gt;
&lt;/center&gt;&lt;/p&gt;
&lt;h2 id="the-end"&gt;The end&lt;/h2&gt;
&lt;p&gt;I don't think there's a winner. They're &lt;em&gt;different&lt;/em&gt; devices - with different goals, and different target audiences. One focuses on design, ecosystem, wellness, and lifestyle. The other is a powerful fitness tracker, and does it very well.&lt;/p&gt;
&lt;p&gt;I &lt;em&gt;could&lt;/em&gt; just use both. But I don't want to add that stress to my life. For now, a good run brings me more joy than paying with my wrist. A good night of sleep is more important than seeing all my calendar events on my wrist. &lt;em&gt;Focus&lt;/em&gt; is more important than constant notifications.&lt;/p&gt;
&lt;p&gt;So I'll stick to the Coros for now.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>infrequent.app - stay in touch with those who matter</title><link href="https://duarteocarmo.com/blog/infrequent.html" rel="alternate"/><published>2022-10-23T19:00:00+02:00</published><updated>2022-10-23T19:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2022-10-23:/blog/infrequent.html</id><summary type="html">&lt;p&gt;Some months ago, I &lt;a href="infrequent-tiny-crm.html"&gt;built&lt;/a&gt; a small script to help me stay in touch with those who matter to me. The concept is fairly simple: I want to stay in touch with person &lt;em&gt;p&lt;/em&gt; every &lt;em&gt;t&lt;/em&gt; period of time. If &lt;em&gt;t&lt;/em&gt; has passed, and you haven't talked to &lt;em&gt;p&lt;/em&gt;, you …&lt;/p&gt;</summary><content type="html">&lt;p&gt;Some months ago, I &lt;a href="infrequent-tiny-crm.html"&gt;built&lt;/a&gt; a small script to help me stay in touch with those who matter to me. The concept is fairly simple: I want to stay in touch with person &lt;em&gt;p&lt;/em&gt; every &lt;em&gt;t&lt;/em&gt; period of time. If &lt;em&gt;t&lt;/em&gt; has passed, and you haven't talked to &lt;em&gt;p&lt;/em&gt;, you get a reminder.&lt;/p&gt;
&lt;p&gt;I've been using this script/system for the past 5 months, and it has served me well. I've been keeping up with mentors, ex-managers, family, and friends - without much effort. Which is the goal. It's not that I don't like editing a bunch of markdown files every week - but I wanted something &lt;em&gt;smoother&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Also, the more I've told people about my system (which is &lt;a href="https://sive.rs/hundreds"&gt;not even mine&lt;/a&gt;), the more I've realized that I'm not the only one who needs something like this. Generally, people have a &lt;em&gt;hard&lt;/em&gt; time keeping in touch. It's not that we don't care - it's that life gets in the way.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://infrequent.app"&gt;
&lt;img src="https://duarteocarmo.com/images/43/website.png" alt="Infrequent dashboard" style="max-width:100%;"&gt;
&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;And so I built &lt;a href="https://infrequent.app"&gt;infrequent&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Infrequent is a system to stay in touch with those I care about. A system that I built &lt;em&gt;literally&lt;/em&gt; for myself, to scratch my own itch. But also a system that I hope helps others out there.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://infrequent.app"&gt;
&lt;img src="https://duarteocarmo.com/images/43/dashboard.png" alt="Infrequent dashboard" style="max-width:100%;"&gt;
&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;I'm not gonna lie, it's been pretty fun to get back into web development. I love AI&amp;amp;ML, but there's something &lt;em&gt;special&lt;/em&gt; about web development. And &lt;a href="https://www.djangoproject.com/"&gt;Django&lt;/a&gt;, &lt;a href="https://htmx.org/"&gt;htmx&lt;/a&gt;, and &lt;a href="https://tailwindcss.com/"&gt;tailwind&lt;/a&gt; have been a joy to work with. The stack is not straightforward to set up, but once you get going, it's a breeze to work with. Django is an &lt;em&gt;incredibly&lt;/em&gt; powerful framework.&lt;/p&gt;
&lt;p&gt;I don't have million-dollar plans for infrequent just yet. For now, I plan to continue using it, and entice others to do so too. I'll keep adding features to it, as mine (and maybe your) needs evolve. Feel free to give it a spin, and don't hesitate to &lt;a href="mailto:me@duarteocarmo.com"&gt;email me&lt;/a&gt; if you like it, or think something is missing.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Monitoring ML models with FastAPI and Evidently AI</title><link href="https://duarteocarmo.com/blog/monitoring-ml-models-fastapi-evidently.html" rel="alternate"/><published>2022-09-17T15:00:00+02:00</published><updated>2022-09-17T15:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2022-09-17:/blog/monitoring-ml-models-fastapi-evidently.html</id><summary type="html">&lt;p&gt;I've deployed a good amount of ML models to production. For many, deployment to production is the last step of the process. Once that's done, work is done. This is far from true. Once your model is out there, problems &lt;em&gt;will&lt;/em&gt; start to arise. Some predictions will be wrong. Some …&lt;/p&gt;</summary><content type="html">&lt;p&gt;I've deployed a good amount of ML models to production. For many, deployment to production is the last step of the process. Once that's done, work is done. This is far from true. Once your model is out there, problems &lt;em&gt;will&lt;/em&gt; start to arise. Some predictions will be wrong. Some labels will occur more often than they should, and some examples &lt;em&gt;will&lt;/em&gt; surprise the model.&lt;/p&gt;
&lt;p&gt;Setting up the right lenses &amp;amp; triggers &lt;em&gt;into&lt;/em&gt; your model is critical. It helps to ensure everything is running smoothly, or to know when issues need to be tackled. Let's open up the black box.&lt;/p&gt;
&lt;h2 id="setting-up-fastapi"&gt;Setting up FastAPI&lt;/h2&gt;
&lt;p&gt;I've previously &lt;a href="/blog/serving-ml-models-fastapi.html"&gt;written&lt;/a&gt; about how to serve your ML model with FastAPI. Let's assume we're serving our model with FastAPI and our &lt;code&gt;src&lt;/code&gt; folder looks something like:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;src
├──&lt;span class="w"&gt; &lt;/span&gt;__init__.py
├──&lt;span class="w"&gt; &lt;/span&gt;api&lt;span class="w"&gt; &lt;/span&gt;&amp;lt;-&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;Fast&lt;span class="w"&gt; &lt;/span&gt;API&lt;span class="w"&gt; &lt;/span&gt;folder
│&lt;span class="w"&gt;   &lt;/span&gt;├──&lt;span class="w"&gt; &lt;/span&gt;__init__.py
│&lt;span class="w"&gt;   &lt;/span&gt;├──&lt;span class="w"&gt; &lt;/span&gt;models.py
│&lt;span class="w"&gt;   &lt;/span&gt;└──&lt;span class="w"&gt; &lt;/span&gt;main.py
├──&lt;span class="w"&gt; &lt;/span&gt;pipeline
│&lt;span class="w"&gt;   &lt;/span&gt;├──&lt;span class="w"&gt; &lt;/span&gt;__init__.py
│&lt;span class="w"&gt;   &lt;/span&gt;├──&lt;span class="w"&gt; &lt;/span&gt;...
│&lt;span class="w"&gt;   &lt;/span&gt;└──&lt;span class="w"&gt; &lt;/span&gt;train.py
└──&lt;span class="w"&gt; &lt;/span&gt;setup.py
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;main.py&lt;/code&gt; file is where our views are defined. Below is our main file, with a &lt;code&gt;/predict&lt;/code&gt; endpoint that serves predictions.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# main.py&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;fastapi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastAPI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;src.pipeline&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_prediction_for&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;.models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Item&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;joblib&lt;/span&gt;

&lt;span class="c1"&gt;# create FastAPI app and load model&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;joblib&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;model.joblib&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# create an endpoint that receives POST requests&lt;/span&gt;

&lt;span class="nd"&gt;@app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;/predict/&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Item&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# some processing&lt;/span&gt;
    &lt;span class="n"&gt;prediction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_prediction_for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;prediction&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="prerequisite-storing-all-predictions"&gt;Prerequisite: Storing all predictions&lt;/h2&gt;
&lt;p&gt;To monitor our model, we must first make sure two things are happening:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Prediction logging: We're actively logging all predictions our model is making&lt;/li&gt;
&lt;li&gt;Access to a reference dataset: We have access to the dataset on which our model was trained (e.g., the training data)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Logging all predictions your model makes can be done using some managed database service (e.g., think Aurora, BigQuery, etc.).  Ideally, we want to do this without increasing our prediction latency.&lt;/p&gt;
&lt;p&gt;Fortunately, FastAPI provides a great tool to do this: &lt;code&gt;BackgroundTasks&lt;/code&gt;. We start by creating a function that saves our data (in this example, to BigQuery):&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# monitoring.py&lt;/span&gt;

&lt;span class="c1"&gt;# ...&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save_to_database&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Item&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="sd"&gt;    Saves input/output dicts to bigquery&lt;/span&gt;
&lt;span class="sd"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BigQuery&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;your_cool_bq_table&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;current_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;rows_to_insert&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;current_time&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;())]&lt;/span&gt;
    &lt;span class="n"&gt;errors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;insert_rows&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                &lt;span class="n"&gt;rows_to_insert&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Error: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;

    &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Saved prediction&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We can now add it to our API as a background task:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# app.py&lt;/span&gt;

&lt;span class="c1"&gt;# ...&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;fastapi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BackgroundTasks&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;.monitoring&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;save_to_database&lt;/span&gt;

&lt;span class="c1"&gt;# ...&lt;/span&gt;

&lt;span class="c1"&gt;# create an endpoint that receives POST requests&lt;/span&gt;

&lt;span class="nd"&gt;@app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;/predict/&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="n"&gt;reponse_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="n"&gt;background_tasks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;BackgroundTasks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Item&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# some processing&lt;/span&gt;
    &lt;span class="n"&gt;prediction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_prediction_for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;background_tasks&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;save_to_bq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prediction&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;prediction&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Notice how the background task runs after the response has been sent, so it does not add to the prediction time. This keeps prediction latency as low as possible while &lt;em&gt;still&lt;/em&gt; saving all predictions.&lt;/p&gt;
&lt;h2 id="setting-up-the-monitoring"&gt;Setting up the monitoring&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://docs.evidentlyai.com/"&gt;Evidently&lt;/a&gt; is a great open-source tool that allows you to set up monitoring for your ML models. It's not the only one, there's a myriad of them, actually. &lt;a href="https://www.nannyml.com/"&gt;nannyML&lt;/a&gt; is another one.&lt;/p&gt;
&lt;p&gt;Evidently allows you to generate a &lt;a href="https://docs.evidentlyai.com/features/dashboards/input_data#dataset-structure"&gt;bunch of different reports&lt;/a&gt; you can generate. In this example, I'll focus on the Data Drift dashboard.&lt;/p&gt;
&lt;p&gt;The Data Drift dashboard allows you to measure the difference in distribution between the predictions you are making, and the labels of your training set. When these two start to become &lt;em&gt;significantly&lt;/em&gt; different, you are likely encountering some drift.&lt;/p&gt;
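&lt;p&gt;To make "difference in distribution" a bit more concrete, here is a minimal sketch of one common way to quantify it: the two-sample Kolmogorov-Smirnov statistic, written in plain Python. This is only an illustration of the idea; Evidently chooses and runs its own statistical tests, and the function name &lt;code&gt;ks_statistic&lt;/code&gt; and the sample values are made up for this example.&lt;/p&gt;

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Largest gap between the two empirical CDFs (0 = identical, 1 = disjoint)."""
    ref_sorted = sorted(reference)
    cur_sorted = sorted(current)

    def ecdf(sorted_values, x):
        # fraction of the sample that is less than or equal to x
        return bisect_right(sorted_values, x) / len(sorted_values)

    # the largest gap always occurs at one of the observed values
    points = set(ref_sorted) | set(cur_sorted)
    return max(abs(ecdf(ref_sorted, x) - ecdf(cur_sorted, x)) for x in points)


# hypothetical values, for illustration only
training_labels = [0.1, 0.2, 0.3, 0.4, 0.5]
similar_predictions = [0.1, 0.2, 0.3, 0.4, 0.5]
shifted_predictions = [0.6, 0.7, 0.8, 0.9, 1.0]

no_drift = ks_statistic(training_labels, similar_predictions)    # 0.0
full_drift = ks_statistic(training_labels, shifted_predictions)  # 1.0
```

&lt;p&gt;A value close to 0 means the two samples look alike; a value close to 1 means they barely overlap. Where to draw the line between the two is a judgment call that depends on your data.&lt;/p&gt;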
&lt;p&gt;Alright, let's build it. We start by creating a couple of functions in our &lt;code&gt;monitoring.py&lt;/code&gt; module:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# ... rest of the monitoring.py&lt;/span&gt;

&lt;span class="n"&gt;DATA_WINDOW_SIZE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt; &lt;span class="c1"&gt;# how many predictions to load&lt;/span&gt;

&lt;span class="c1"&gt;# loads our training/reference dataset&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_train_data&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;train_file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;static/train_data.csv&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;train_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;train_df&lt;/span&gt;

&lt;span class="c1"&gt;# loads our latest predictions&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_last_predictions&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="s2"&gt;    SELECT created_at, input, output&lt;/span&gt;
&lt;span class="s2"&gt;    FROM `my_cool_bgq_table`&lt;/span&gt;
&lt;span class="s2"&gt;    ORDER BY created_at DESC&lt;/span&gt;
&lt;span class="s2"&gt;    LIMIT &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;DATA_WINDOW_SIZE&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;;&lt;/span&gt;
&lt;span class="s2"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;prediction_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_gbq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;prediction_data&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
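
&lt;p&gt;One detail worth keeping in mind: &lt;code&gt;save_to_database&lt;/code&gt; stores the input and the result as JSON strings, so the rows that come back hold JSON text rather than one column per feature. Assuming the reference CSV has one column per feature, the logged rows probably need to be unpacked before the two datasets can be compared. A minimal sketch of that step, in plain Python (the helper &lt;code&gt;flatten_prediction_row&lt;/code&gt; and the field names are hypothetical):&lt;/p&gt;

```python
import json


def flatten_prediction_row(row):
    """Merge the JSON-encoded input and output of one logged row into a flat dict."""
    flat = {"created_at": row["created_at"]}
    flat.update(json.loads(row["input"]))
    flat.update(json.loads(row["output"]))
    return flat


# hypothetical logged row; `age`, `income` and `prediction` are made-up fields
logged_row = {
    "created_at": "2022-08-01T12:00:00",
    "input": '{"age": 42, "income": 50000}',
    "output": '{"prediction": 0.87}',
}

flat_row = flatten_prediction_row(logged_row)
```

&lt;p&gt;Applied to every row, this gives a list of flat dictionaries that can be turned into a DataFrame with the same columns as the reference data.&lt;/p&gt;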

&lt;p&gt;Now that we're able to fetch both our reference data and our past predictions, we're ready to build our Data Drift dashboard:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# ... rest of the monitoring.py&lt;/span&gt;

&lt;span class="c1"&gt;# this function generates a dashboard from our reference and prediction data&lt;/span&gt;

&lt;span class="c1"&gt;# which is then saved to a `drift.html` file&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_dashboard&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;dasboard_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;static/drift.html&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;data_drift_dashboard&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Dashboard&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;tabs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="n"&gt;DataDriftTab&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;verbose_level&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;reference_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;load_reference_data&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;current_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;load_last_predictions&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;data_drift_dashboard&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;calculate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;reference_data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;reference_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;current_data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;current_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;column_mapping&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;data_drift_dashboard&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dasboard_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Dashboard saved to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;dasboard_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;dasboard_name&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Notice how we're creating our dashboard, and then saving it to a &lt;code&gt;static/drift.html&lt;/code&gt; file. The idea is then to serve this dashboard in one of our FastAPI endpoints.&lt;/p&gt;
&lt;h2 id="monitoring-dashboard"&gt;Monitoring dashboard&lt;/h2&gt;
&lt;p&gt;Let's serve our data drift dashboard:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;.monitoring&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;generate_dashboard&lt;/span&gt;

&lt;span class="c1"&gt;# ... rest of the main.py&lt;/span&gt;

&lt;span class="nd"&gt;@app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;/monitoring&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Other&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;monitoring&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;dashboard_location&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;generate_dashboard&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;FileResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dashboard_location&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Every time we visit &lt;code&gt;/monitoring&lt;/code&gt;, FastAPI will run the &lt;code&gt;generate_dashboard&lt;/code&gt; function, and return an &lt;code&gt;html&lt;/code&gt; file:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/42/dashboard.png" alt="Monitoring Dashboard" style="max-width:100%;"&gt;&lt;/p&gt;
&lt;p&gt;As you can see, this dashboard compares the distributions of our reference and &lt;em&gt;current&lt;/em&gt; datasets, where the current dataset is the latest &lt;code&gt;DATA_WINDOW_SIZE&lt;/code&gt; predictions we've made.&lt;/p&gt;
&lt;h2 id="closing-thoughts"&gt;Closing thoughts&lt;/h2&gt;
&lt;p&gt;I've found this to be a &lt;em&gt;relatively&lt;/em&gt; straightforward way of adding a bit of visibility into what's really happening in my production models. If those distributions are looking particularly skewed, you know it's time to act.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://evidentlyai.com/"&gt;Evidently&lt;/a&gt; allows us to generate much more than just a data drift dashboard. You can also generate dashboards to monitor data quality, the performance of a regression, classification performance, and many more. It's worth taking a look at their docs to see what fits your use case best.&lt;/p&gt;
&lt;p&gt;There's a way we could increase the speed here. Instead of computing the entire dashboard every time we visit &lt;code&gt;/monitoring&lt;/code&gt;, we &lt;a href="https://fastapi-utils.davidmontague.xyz/user-guide/repeated-tasks/#the-repeat_every-decorator"&gt;could&lt;/a&gt; compute it every X time period in the &lt;em&gt;background&lt;/em&gt;. This would result in a much faster response from the &lt;code&gt;/monitoring&lt;/code&gt; endpoint.&lt;/p&gt;
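&lt;p&gt;A simpler variation on the same idea, sketched below, is to cache the dashboard lazily rather than rebuild it in the background: the dashboard is regenerated at most once per time window, and every other request reuses the cached file. Note that this is a different technique from the linked decorator, and that the first request in each window still pays the full cost. It only uses the standard library; &lt;code&gt;REFRESH_SECONDS&lt;/code&gt;, &lt;code&gt;cached_dashboard&lt;/code&gt; and the stand-in &lt;code&gt;generate_dashboard&lt;/code&gt; are assumptions made for this example.&lt;/p&gt;

```python
import time
from functools import lru_cache

REFRESH_SECONDS = 600  # assumed refresh period (10 minutes)


def generate_dashboard():
    # stand-in for the real (slow) function from monitoring.py
    generate_dashboard.calls += 1
    return "static/drift.html"


generate_dashboard.calls = 0


@lru_cache(maxsize=1)
def _dashboard_for_bucket(bucket):
    # `bucket` only serves as the cache key: a new value forces a rebuild
    return generate_dashboard()


def cached_dashboard(now=None):
    """Return the dashboard path, rebuilding at most once per refresh window."""
    now = time.time() if now is None else now
    return _dashboard_for_bucket(int(now // REFRESH_SECONDS))


cached_dashboard(now=0)    # first call builds the dashboard
cached_dashboard(now=599)  # same 10-minute window: served from the cache
cached_dashboard(now=600)  # next window: rebuilt
```

&lt;p&gt;The &lt;code&gt;/monitoring&lt;/code&gt; endpoint would then call &lt;code&gt;cached_dashboard()&lt;/code&gt; instead of &lt;code&gt;generate_dashboard()&lt;/code&gt; directly.&lt;/p&gt;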
&lt;p&gt;Is this dashboard enough to make sure everything is going well in production? No. But it's a &lt;em&gt;great&lt;/em&gt; first step towards figuring out what's &lt;em&gt;really&lt;/em&gt; going on.&lt;/p&gt;
&lt;hr&gt;

&lt;h4 id="updates-notes"&gt;Updates &amp;amp; notes&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://mlops.community/monitoring-ml-models-with-fastapi-and-evidently-ai/"&gt;This post was republished by the MLOps Community blog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;I also wrote a &lt;a href="/blog/monitoring-machine-learning-apis.html"&gt;more general post&lt;/a&gt; on how to monitor machine learning APIs&lt;/li&gt;
&lt;/ul&gt;</content><category term="blog"/></entry><entry><title>Down from the Cloud</title><link href="https://duarteocarmo.com/blog/down-from-the-cloud-self-hosting.html" rel="alternate"/><published>2022-08-05T11:00:00+02:00</published><updated>2022-08-05T11:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2022-08-05:/blog/down-from-the-cloud-self-hosting.html</id><summary type="html">&lt;p&gt;I love the Cloud. For years, I've been deploying software to it. Azure, GCP, AWS, you name it. I've used most of them. To be honest, they're the same pig, but with different lipstick (like the Danes say). Package up your app, make some CI/CD magic, select your service …&lt;/p&gt;</summary><content type="html">&lt;p&gt;I love the Cloud. For years, I've been deploying software to it. Azure, GCP, AWS, you name it. I've used most of them. To be honest, they're the same pig, but with different lipstick (like the Danes say). Package up your app, make some CI/CD magic, select your service, and I'm good to go. Still remember when I helped a whole bank move to the Public Cloud. The benefits are mostly obvious.&lt;/p&gt;
&lt;p&gt;But I'm no bank. And it's not all rainbows and unicorns. For starters, those 10 elastic beanstalk applications can start adding up. What if I want to change from Cloud A to Cloud B? It's not like Amazon is going to make my life easy. It's a convenience/lock-in trade-off. What about user data? What about &lt;em&gt;my&lt;/em&gt; data? Some of the &lt;a href="https://marco.org/2014/02/23/the-value-of-background-fetch"&gt;best&lt;/a&gt; &lt;a href="https://twitter.com/levelsio/status/1308145873843560449"&gt;software&lt;/a&gt; I use doesn't even use the Cloud.&lt;/p&gt;
&lt;p&gt;Let's come down from the Cloud (at least the public one): Self-hosting. How &lt;em&gt;hard&lt;/em&gt; is it, really?&lt;/p&gt;
&lt;h2 id="get-a-server"&gt;Get a server&lt;/h2&gt;
&lt;p&gt;This one should be straightforward. There's no shortage of server providers. After browsing &lt;a href="https://lowendbox.com/"&gt;LowEndBox&lt;/a&gt; for a while, I decided to go with a dedicated server from &lt;a href="https://www.hetzner.com/"&gt;Hetzner&lt;/a&gt;. Something relatively close to Copenhagen to ensure my connection to the server is snappy enough.&lt;/p&gt;
&lt;p&gt;I don't run a lot of very high traffic sites (yet!). Something around the 50 EUR/month price point serves just fine. You'll be surprised by how much stuff you can run on this machine. I have 10 services running and my memory barely goes above 10%. Let's not jinx it.&lt;/p&gt;
&lt;h2 id="containerize-all-the-things"&gt;Containerize all the things&lt;/h2&gt;
&lt;p&gt;I have a love-hate relationship with Docker. For me, it's still the most straightforward way of exchanging and packaging up software to ensure nothing breaks. But how I hate when my container starts getting fat. I'm talking about you, PyTorch.&lt;/p&gt;
&lt;p&gt;Every application will run inside its own containerized environment. To ensure this doesn't get overly complicated, I went with &lt;code&gt;docker compose&lt;/code&gt;. Why not Kubernetes you may ask? Because I have other things to do.&lt;/p&gt;
&lt;p&gt;Inside the server, things look a little something like:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="p"&gt;|&lt;/span&gt;--&lt;span class="w"&gt; &lt;/span&gt;Caddyfile&lt;span class="w"&gt; &lt;/span&gt;&amp;lt;-&lt;span class="w"&gt; &lt;/span&gt;more&lt;span class="w"&gt; &lt;/span&gt;on&lt;span class="w"&gt; &lt;/span&gt;this&lt;span class="w"&gt; &lt;/span&gt;later
&lt;span class="sb"&gt;`&lt;/span&gt;--&lt;span class="w"&gt; &lt;/span&gt;projects
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;--&lt;span class="w"&gt; &lt;/span&gt;project-1&lt;span class="w"&gt; &lt;/span&gt;&amp;lt;-&lt;span class="w"&gt; &lt;/span&gt;first&lt;span class="w"&gt; &lt;/span&gt;application
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;--&lt;span class="w"&gt; &lt;/span&gt;...
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;--&lt;span class="w"&gt; &lt;/span&gt;Dockerfile
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;--&lt;span class="w"&gt; &lt;/span&gt;docker-compose.yml&lt;span class="w"&gt; &lt;/span&gt;&amp;lt;-&lt;span class="w"&gt; &lt;/span&gt;docker&lt;span class="w"&gt; &lt;/span&gt;compose&lt;span class="w"&gt; &lt;/span&gt;file
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;--&lt;span class="w"&gt; &lt;/span&gt;project-2&lt;span class="w"&gt; &lt;/span&gt;&amp;lt;-&lt;span class="w"&gt; &lt;/span&gt;second&lt;span class="w"&gt; &lt;/span&gt;application
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;--&lt;span class="w"&gt; &lt;/span&gt;...
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;--&lt;span class="w"&gt; &lt;/span&gt;Dockerfile
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;--&lt;span class="w"&gt; &lt;/span&gt;docker-compose.yml&lt;span class="w"&gt; &lt;/span&gt;&amp;lt;-&lt;span class="w"&gt; &lt;/span&gt;second&lt;span class="w"&gt; &lt;/span&gt;docker&lt;span class="w"&gt; &lt;/span&gt;compose&lt;span class="w"&gt; &lt;/span&gt;file
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Every project gets its own &lt;code&gt;docker-compose.yml&lt;/code&gt; file. Every project runs on its specific server port. For example, to run &lt;code&gt;project-1&lt;/code&gt;, I &lt;code&gt;cd&lt;/code&gt; into the directory and start the container: &lt;code&gt;docker compose up --force-recreate --build -d&lt;/code&gt;. This automatically starts the service on the port I specify in the &lt;code&gt;docker-compose.yml&lt;/code&gt; file.&lt;/p&gt;
&lt;p&gt;Using the process above, I can have a bunch of different applications running on a lot of different ports, but on a single server. Hopefully (1) saving money, and (2) increasing the power and control over them.&lt;/p&gt;
&lt;p&gt;Hopefully, I won't land on dependency-nightmare-land.&lt;/p&gt;
&lt;h2 id="continuously-deploy"&gt;Continuously deploy&lt;/h2&gt;
&lt;p&gt;Self-hosting sounds great. Having to &lt;code&gt;ssh&lt;/code&gt; into a server &lt;em&gt;every time&lt;/em&gt; I want to update a service, doesn't. If there's one thing I'm not willing to compromise on, it's continuous deployments. The whole &lt;code&gt;git push&lt;/code&gt; automatically updates the app thing is super convenient.&lt;/p&gt;
&lt;p&gt;Enter GitHub Actions. With them, you can automate the ssh'ing and re-deployment part. It's not the most &lt;em&gt;secure&lt;/em&gt; option. But hey, it works pretty well. Surprisingly, it's also much faster than deploying on Elastic Beanstalk or Cloud Run.&lt;/p&gt;
&lt;p&gt;Here's what these actions look like:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;Deploy to server&lt;/span&gt;
&lt;span class="nt"&gt;on&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;push&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="nt"&gt;branches&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;master&lt;/span&gt;
&lt;span class="nt"&gt;jobs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;build&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;runs-on&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;ubuntu-latest&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;Checkout source code&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;uses&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;actions/checkout@v1&lt;/span&gt;

&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;uses&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;actions/checkout@master&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;Copies repository to server&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;uses&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;appleboy/scp-action@master&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;with&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;host&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;${{ secrets.HOST }} &amp;lt;- these are defined in your repo settings.&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;username&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;${{ secrets.USERNAME }}&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;password&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;${{ secrets.PASSWORD }}&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;22&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;overwrite&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;true&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;source&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;.&amp;quot;&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;target&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;/root/projects/project-X&amp;quot;&lt;/span&gt;

&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;uses&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;appleboy/ssh-action@master&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;Stops and updates docker container as deamon&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;with&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;host&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;${{ secrets.HOST }}&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;username&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;${{ secrets.USERNAME }}&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;password&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;${{ secrets.PASSWORD }}&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;22&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;script&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p p-Indicator"&gt;|&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="no"&gt;cd projects/project-X&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="no"&gt;docker compose down&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="no"&gt;docker compose up --force-recreate --build -d&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="no"&gt;docker ps&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h2 id="https-we-meet-again"&gt;HTTPS, we meet again&lt;/h2&gt;
&lt;p&gt;Alright, it's time to expose these apps to the world. I'll admit it. The mere mention of &lt;em&gt;https certificates&lt;/em&gt; or &lt;em&gt;DNS&lt;/em&gt; sends a chill down my spine. After losing a couple of hours in the &lt;a href="https://docs.nginx.com/nginx/admin-guide/web-server/reverse-proxy/"&gt;reverse proxy documentation for Nginx&lt;/a&gt; and getting a bit mad, I found this little thing called &lt;a href="https://caddyserver.com/"&gt;Caddy&lt;/a&gt;. Oh boy, that was easy.&lt;/p&gt;
&lt;p&gt;Caddy solves the whole subdomain, reverse proxy, multiple websites on the same server, HTTPS &lt;em&gt;mumbo jumbo&lt;/em&gt;. The docs are great, and it's super straightforward to use. I definitely recommend their &lt;a href="https://caddyserver.com/docs/getting-started"&gt;Getting Started&lt;/a&gt; docs.&lt;/p&gt;
&lt;p&gt;First step is to point some A record in your DNS to your server's IP address. I use Cloudflare for this, do recommend. Once Caddy is installed on the server, all you need to do is create a &lt;code&gt;Caddyfile&lt;/code&gt;:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;project-1.mydomain.com {
        reverse_proxy localhost:3000
}

project-2.mydomain.com {
        reverse_proxy localhost:5000
}

home.mydomain.com {
        respond &amp;quot;Base domain&amp;quot;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;After that, run &lt;code&gt;caddy start&lt;/code&gt;, and we're done. Yes. Your eyes are not deceiving you. That's absolutely it. &lt;code&gt;project-1.mydomain.com&lt;/code&gt; is connected to my app running on port &lt;code&gt;3000&lt;/code&gt;, and &lt;code&gt;project-2.mydomain.com&lt;/code&gt; is connected to the container on port &lt;code&gt;5000&lt;/code&gt;. HTTPS? Automatic. Reverse proxy? Done.&lt;/p&gt;
&lt;p&gt;Caddy automatically enables HTTPS for all of the apps you specify in your &lt;code&gt;Caddyfile&lt;/code&gt;. It also automatically issues new certificates and renews them. Changes? A simple &lt;code&gt;caddy reload&lt;/code&gt; and you're up and running again.&lt;/p&gt;
&lt;p&gt;I can have as many subdomains/domains pointing to the same server as I want, as long as the server can handle it.&lt;/p&gt;
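&lt;p&gt;Once the list of subdomains grows, writing those blocks by hand gets repetitive. Here's a minimal sketch of generating the same &lt;code&gt;Caddyfile&lt;/code&gt; layout from a mapping of hostnames to local ports. The &lt;code&gt;render_caddyfile&lt;/code&gt; helper and the &lt;code&gt;sites&lt;/code&gt; mapping are hypothetical names made up for illustration; they are not part of Caddy.&lt;/p&gt;

```python
def render_caddyfile(sites):
    """Render one reverse_proxy block per hostname.

    `sites` maps a hostname to the local port its app listens on.
    Hypothetical helper for illustration; it is not part of Caddy.
    """
    blocks = []
    for host, port in sites.items():
        blocks.append(
            f"{host} {{\n"
            f"        reverse_proxy localhost:{port}\n"
            f"}}\n"
        )
    # Separate the site blocks with a blank line, as in the example above.
    return "\n".join(blocks)


# Assumed mapping, mirroring the two proxied apps from the Caddyfile above.
sites = {
    "project-1.mydomain.com": 3000,
    "project-2.mydomain.com": 5000,
}

caddyfile = render_caddyfile(sites)
print(caddyfile)
```

&lt;p&gt;You would write the result to your &lt;code&gt;Caddyfile&lt;/code&gt; and run &lt;code&gt;caddy reload&lt;/code&gt;, exactly as with a hand-written one.&lt;/p&gt;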
&lt;h2 id="closing-thoughts"&gt;Closing thoughts&lt;/h2&gt;
&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/41/glances.png" alt="Glances" style="max-width:100%;"&gt;&lt;/p&gt;
&lt;p&gt;One of the services I'm currently self-hosting is called &lt;a href="https://nicolargo.github.io/glances/"&gt;Glances&lt;/a&gt;. With it, all of my server's metrics are a URL away. With about 5 containers, the CPU usage is still below 10%.&lt;/p&gt;
&lt;p&gt;No. I don't have all the elasticity that the Public Cloud gives me. But do I need it? The deployment method is the same (&lt;code&gt;git push&lt;/code&gt;), the speed of the services is the same (mostly static with Cloudflare on top), the cost is lower, and data is controlled by me. Sounds like a treat.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>A recipe recommendation system</title><link href="https://duarteocarmo.com/blog/scandinavia-food-python-recommendation-systems.html" rel="alternate"/><published>2022-06-26T11:55:00+02:00</published><updated>2022-06-26T11:55:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2022-06-26:/blog/scandinavia-food-python-recommendation-systems.html</id><summary type="html">&lt;p&gt;You arrive home and turn on Netflix. On the front page: hundreds of shows. They're not ordered randomly. Netflix has been perfecting the science of recommending shows for years now. Google has perfected the science of sorting search results.&lt;/p&gt;
&lt;p&gt;To this day, no one knows exactly how Google sorts their …&lt;/p&gt;</summary><content type="html">&lt;p&gt;You arrive home and turn on Netflix. On the front page: hundreds of shows. They're not ordered randomly. Netflix has been perfecting the science of recommending shows for years now. Google has perfected the science of sorting search results.&lt;/p&gt;
&lt;p&gt;To this day, no one knows exactly how Google sorts their search results. Recommendation systems are gold, and a big part of the success of big tech.&lt;/p&gt;
&lt;p&gt;Yeah, this article is not about food. It's about &lt;em&gt;recommender systems&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;While building one, I debunked my assumption that Scandinavian food was not for me. Turns out, it's one of my favourites (at least according to what I built). Hard to believe?&lt;/p&gt;
&lt;h2 id="on-cooking"&gt;On cooking&lt;/h2&gt;
&lt;p&gt;I'm not a fan of the culinary arts (the opposite really). But I decided to build a dataset of recipes. 160,000 recipes. Each one has a name, a region, a link, and ingredients. I scraped them from &lt;a href="https://cosylab.iiitd.edu.in/recipedb/"&gt;RecipeDB&lt;/a&gt;. RecipeDB scraped them from somewhere else, so the data is &lt;em&gt;ok-ish&lt;/em&gt; quality.&lt;/p&gt;
&lt;p&gt;Here's a sample:&lt;/p&gt;
&lt;p&gt;&lt;img alt="table data" style="border-radius: 5px;" src="https://duarteocarmo.com/images/40/table.png"&gt;&lt;/p&gt;
&lt;p&gt;And here's the distribution of recipes across regions:&lt;/p&gt;
&lt;p&gt;&lt;img alt="value counts" style="border-radius: 5px;" src="https://duarteocarmo.com/images/40/value_counts.png"&gt;&lt;/p&gt;
&lt;h2 id="vectors-are-dope-more-of-those"&gt;Vectors are dope, more of those&lt;/h2&gt;
&lt;p&gt;Data is messy. Images, text, audio... They come in thousands of formats. But there's a common language to every data type: &lt;strong&gt;vectors&lt;/strong&gt; (or arrays, whatever).&lt;/p&gt;
&lt;p&gt;In machine learning, vectors that represent data are referred to as &lt;em&gt;embeddings&lt;/em&gt;. Transforming a word or a sentence into an embedding is straightforward nowadays:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;flair.embeddings&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DocumentPoolEmbeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;WordEmbeddings&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;flair.data&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Sentence&lt;/span&gt;

&lt;span class="c1"&gt;# create the word embedder (GloVe in this case)&lt;/span&gt;

&lt;span class="n"&gt;glove_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;WordEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;glove&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# pool the word vectors into one sentence vector (mean pooling is the default)&lt;/span&gt;

&lt;span class="n"&gt;document_embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DocumentPoolEmbeddings&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;glove_embedding&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="c1"&gt;# create a sentence&lt;/span&gt;

&lt;span class="n"&gt;sentence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Sentence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;pineapple pizza&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# embed a sentence&lt;/span&gt;

&lt;span class="n"&gt;document_embeddings&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# you can get the embedding for the whole sentence&lt;/span&gt;

&lt;span class="n"&gt;sentence_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;

&lt;span class="c1"&gt;# or check the embedding for each word (e.g., token)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# prints...&lt;/span&gt;

&lt;span class="c1"&gt;# Token[0]: &amp;quot;pineapple&amp;quot;&lt;/span&gt;

&lt;span class="c1"&gt;# tensor([-1.3641,  0.2901, ...])&lt;/span&gt;

&lt;span class="c1"&gt;# Token[1]: &amp;quot;pizza&amp;quot;&lt;/span&gt;

&lt;span class="c1"&gt;# tensor([...])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The vector for a sentence is obtained by averaging the vectors of the words inside it. We could also use BERT to &lt;a href="https://www.sbert.net/"&gt;generate&lt;/a&gt; embeddings for sentences.&lt;/p&gt;
&lt;p&gt;There's one big advantage of these pretrained models like BERT: the embeddings preserve the semantic meaning of the words. So, in theory, two sentences with similar meanings &lt;em&gt;should&lt;/em&gt; have similar vectors.&lt;/p&gt;
&lt;p&gt;These fancy embeddings take time to compute. In the interest of time, I went with &lt;em&gt;&lt;a href="https://nlp.stanford.edu/projects/glove/"&gt;Glove&lt;/a&gt;&lt;/em&gt; to calculate them.&lt;/p&gt;
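&lt;p&gt;To make the averaging and the "similar meanings, similar vectors" idea concrete, here's a toy sketch in plain Python. The 3-dimensional vectors are made up for illustration (real GloVe vectors have 100 dimensions here), and &lt;code&gt;mean_pool&lt;/code&gt;, &lt;code&gt;cosine_similarity&lt;/code&gt; and &lt;code&gt;embed&lt;/code&gt; are hypothetical helpers, not flair functions.&lt;/p&gt;

```python
import math


def mean_pool(vectors):
    """Average equal-length word vectors into one sentence vector."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up 3-dimensional "word vectors", purely for illustration.
toy_vectors = {
    "pineapple": [0.9, 0.1, 0.0],
    "mango": [0.8, 0.2, 0.1],
    "pizza": [0.1, 0.9, 0.2],
    "bolt": [0.0, 0.1, 0.9],
}


def embed(sentence):
    """Embed a sentence by averaging the toy vectors of its words."""
    return mean_pool([toy_vectors[word] for word in sentence.split()])


pineapple_pizza = embed("pineapple pizza")
mango_pizza = embed("mango pizza")
bolt = embed("bolt")

# The two pizzas should score much closer to each other than to "bolt".
print(cosine_similarity(pineapple_pizza, mango_pizza))
print(cosine_similarity(pineapple_pizza, bolt))
```

&lt;p&gt;With real embeddings the idea is the same: sentences about similar things end up pointing in similar directions.&lt;/p&gt;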
&lt;h2 id="a-galaxy-of-recipes"&gt;A galaxy of recipes&lt;/h2&gt;
&lt;p&gt;Let's leverage the technique above to embed all the recipes in our dataset. We take the &lt;code&gt;ingredient_string&lt;/code&gt; of each recipe, and use Glove to generate a matrix of embeddings:&lt;/p&gt;
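&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt;

&lt;span class="c1"&gt;# create embeddings for all of them&lt;/span&gt;
&lt;span class="c1"&gt;# (document_embeddings is assumed to be a flair document embedder, e.g. DocumentPoolEmbeddings over GloVe)&lt;/span&gt;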

&lt;span class="n"&gt;sentence_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Sentence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ingredient_string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;document_embeddings&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentences&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sentence_list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;embedding_matrix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sentence_list&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding_matrix&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# prints (115657, 100)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;That's a pretty large matrix: 115,657 vectors, each with 100 dimensions. Now, we can visualise 3 dimensions, but not 100.&lt;/p&gt;
&lt;p&gt;To visualise them, we need to reduce the dimensionality of these vectors. UMAP does just that:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;umap&lt;/span&gt;

&lt;span class="c1"&gt;# let&amp;#39;s use UMAP to help us visualize these vectors&lt;/span&gt;

&lt;span class="c1"&gt;# you could use t-SNE, PCA, etc.&lt;/span&gt;

&lt;span class="n"&gt;umap_fit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;umap&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UMAP&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;umap_matrix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;umap_fit&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding_matrix&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;umap_matrix&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# (115657, 2)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Alright, now we have a 2D matrix of our roughly 116,000 recipes. That's &lt;em&gt;visualisable&lt;/em&gt;. For the sake of your browser, here's a random set of 2000 recipes in 2D space:&lt;/p&gt;
&lt;div id="vis-1"&gt;&lt;/div&gt;
&lt;script&gt;
(function(vegaEmbed) {
var spec = {"config": {"view": {"continuousWidth": 400, "continuousHeight": 300}}, "data": {"name": "data-87be018bf2fe12092f9c6c0475c82a77"}, "mark": "point", "encoding": {"color": {"field": "region", "type": "nominal"}, "tooltip": [{"field": "recipe", "type": "nominal"}, {"field": "region", "type": "nominal"}, {"field": "url", "type": "nominal"}], "x": {"field": "umap_0", "type": "quantitative"}, "y": {"field": "umap_1", "type": "quantitative"}}, "height": 500, "selection": {"selector001": {"type": "interval", "bind": "scales", "encodings": ["x", "y"]}}, "title": "Visualizing 2000 recipes", "width": 500, "$schema": "https://vega.github.io/schema/vega-lite/v4.17.0.json", "datasets": {"data-87be018bf2fe12092f9c6c0475c82a77": [{"umap_0": -1.3918496370315552, "umap_1": 11.757913589477539, "region": "Mexican", "recipe": "Salsa Del Lopez", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83033"}, {"umap_0": -1.0309573411941528, "umap_1": 9.523167610168457, "region": "French", "recipe": "Crock Pot Chicken Provencal", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/116730"}, {"umap_0": -1.2424802780151367, "umap_1": 6.40226936340332, "region": "Deutschland", "recipe": "Swiss Steak for Two (Crock Pot)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/117709"}, {"umap_0": 10.127331733703613, "umap_1": 5.45508337020874, "region": "Eastern European", "recipe": "Drop Scones (So Easy)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/133616"}, {"umap_0": -1.9859997034072876, "umap_1": 5.981219291687012, "region": "Australian", "recipe": "Greek Veggie Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/5164"}, {"umap_0": -2.2660439014434814, "umap_1": 7.329191207885742, "region": "Greek", "recipe": "Grilled Mediterranean Salad Stuffed Peppers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/111022"}, {"umap_0": -2.0031380653381348, "umap_1": 8.42020320892334, "region": "Italian", 
"recipe": "Insalata Primavera (Italian Spring Salad)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120536"}, {"umap_0": 3.126434564590454, "umap_1": 8.945534706115723, "region": "Chinese and Mongolian", "recipe": "Chinese Pork Chops (Diabetic)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56844"}, {"umap_0": 9.458781242370605, "umap_1": 7.472686290740967, "region": "Canadian", "recipe": "Baked Rice Cream", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144214"}, {"umap_0": 0.10703326016664505, "umap_1": 9.786750793457031, "region": "Mexican", "recipe": "Chicken Enchiladas With Roasted Tomatillo Chile Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/88978"}, {"umap_0": -1.4782341718673706, "umap_1": 8.419039726257324, "region": "Greek", "recipe": "Bobby Flay's Kentucky Ham Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/109617"}, {"umap_0": 0.31108716130256653, "umap_1": 4.9014973640441895, "region": "South American", "recipe": "Easy Supreme Beef Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99412"}, {"umap_0": 3.222634792327881, "umap_1": 9.869194984436035, "region": "Thai", "recipe": "Ginger Broccoli", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59579"}, {"umap_0": -1.4622187614440918, "umap_1": 5.559756278991699, "region": "US", "recipe": "Pittsburgh Football Sunday Pasta Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/13975"}, {"umap_0": 1.8007495403289795, "umap_1": 5.676237106323242, "region": "Mexican", "recipe": "Chipotle Sandwich Wrap", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78116"}, {"umap_0": -0.52838134765625, "umap_1": 10.330512046813965, "region": "Caribbean", "recipe": "Barbecued Pig", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/8039"}, {"umap_0": 8.998263359069824, "umap_1": 7.855962753295898, "region": "Indian Subcontinent", 
"recipe": "Silver Moon", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67682"}, {"umap_0": 0.10560809075832367, "umap_1": 9.160552024841309, "region": "US", "recipe": "Big B's Collard Greens", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16113"}, {"umap_0": -0.3880225718021393, "umap_1": 5.414238452911377, "region": "Deutschland", "recipe": "Authentic Switzerland Fondue", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118188"}, {"umap_0": -0.6247455477714539, "umap_1": 7.495125770568848, "region": "Irish", "recipe": "Yummy Brussels Sprouts With Bacon &amp; Onion", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135938"}, {"umap_0": 1.7616580724716187, "umap_1": 4.886806011199951, "region": "Mexican", "recipe": "Mandy's Lamb Enchiladas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6121"}, {"umap_0": 11.793554306030273, "umap_1": 8.38292407989502, "region": "Rest Africa", "recipe": "Peppermint Crisp Tart - Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/51420"}, {"umap_0": 0.8003747463226318, "umap_1": 12.410970687866211, "region": "Indian Subcontinent", "recipe": "Taj Maholla! 
Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64042"}, {"umap_0": 0.9994710087776184, "umap_1": 8.146405220031738, "region": "Southeast Asian", "recipe": "Filipino Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63370"}, {"umap_0": 10.199616432189941, "umap_1": 10.071054458618164, "region": "Central American", "recipe": "Killer Sangria", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92985"}, {"umap_0": 1.4438186883926392, "umap_1": 5.494691371917725, "region": "Mexican", "recipe": "Mexican Omelet Roll-Ups With Avocado Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81023"}, {"umap_0": -0.7491047382354736, "umap_1": 4.932082176208496, "region": "Spanish and Portuguese", "recipe": "Potato and Leek Frittata", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142057"}, {"umap_0": 9.100556373596191, "umap_1": 9.097440719604492, "region": "Rest Africa", "recipe": "Clementines in Ginger Syrup, Algerian", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48573"}, {"umap_0": -1.7490047216415405, "umap_1": 8.839705467224121, "region": "Italian", "recipe": "Savino's Herb and Olive Chicken Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/10408"}, {"umap_0": -0.7827649712562561, "umap_1": 11.357047080993652, "region": "French", "recipe": "Flavoured Vinegar", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112016"}, {"umap_0": -0.24351391196250916, "umap_1": 10.631551742553711, "region": "South American", "recipe": "Colombian Chicken Soup a La Emeril", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93821"}, {"umap_0": 0.2858981788158417, "umap_1": 9.938416481018066, "region": "Mexican", "recipe": "Chuckwagon Beans (Frijoles a La Charra)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77883"}, {"umap_0": -0.34065333008766174, "umap_1": 8.353398323059082, "region": "South American", 
"recipe": "Beef and Asparagus Roll-Ups", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/95130"}, {"umap_0": 1.8140946626663208, "umap_1": 6.912238121032715, "region": "Italian", "recipe": "Italian Hot &amp; Spicy Oven Fried Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127354"}, {"umap_0": -1.529252529144287, "umap_1": 6.6026506423950195, "region": "Italian", "recipe": "Italian 1 Pot Meal", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/126272"}, {"umap_0": 0.8462070226669312, "umap_1": 8.389432907104492, "region": "Deutschland", "recipe": "Swiss Steak Bh&amp;g Year 1937", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118043"}, {"umap_0": 11.02765941619873, "umap_1": 7.757143974304199, "region": "French", "recipe": "Godiva Marjolaine", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114348"}, {"umap_0": 0.47562727332115173, "umap_1": 9.728699684143066, "region": "Mexican", "recipe": "Mexican Rice III", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6340"}, {"umap_0": 1.5462573766708374, "umap_1": 4.492384910583496, "region": "French", "recipe": "Taco Salad With Doritos &amp; French Dressing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115615"}, {"umap_0": 0.2590952217578888, "umap_1": 10.38813591003418, "region": "Italian", "recipe": "Mexican Pesto", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/10900"}, {"umap_0": -1.8283843994140625, "umap_1": 9.035175323486328, "region": "Italian", "recipe": "Calamari for a King", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132357"}, {"umap_0": 2.496938943862915, "umap_1": 10.862029075622559, "region": "Southeast Asian", "recipe": "Bun Chay (Vietnamese Veggie Rice Vermicelli Salad)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62370"}, {"umap_0": -1.1525416374206543, "umap_1": 5.221553325653076, "region": "Italian", "recipe": "Sorrento 
Stuffed Chicken Breasts", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122560"}, {"umap_0": -2.449894428253174, "umap_1": 9.124031066894531, "region": "South American", "recipe": "Grilled Mesquite Beef Fajitas (Tex-Mex)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/97756"}, {"umap_0": -1.6584727764129639, "umap_1": 12.030065536499023, "region": "US", "recipe": "Easy Guacamole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/18256"}, {"umap_0": -0.9459820985794067, "umap_1": 5.569516658782959, "region": "Italian", "recipe": "Roasted Asparagus with Brown Butter and Pecorino", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/119717"}, {"umap_0": 2.6394083499908447, "umap_1": 10.345782279968262, "region": "Rest Africa", "recipe": "Oluwombo", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48916"}, {"umap_0": -2.517686605453491, "umap_1": 8.030699729919434, "region": "Greek", "recipe": "Best Greek Quinoa Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/9252"}, {"umap_0": 0.5090325474739075, "umap_1": 12.251130104064941, "region": "Indian Subcontinent", "recipe": "Lamb Sukuti (Crispy Smoked Lamb Marinated in Nepali Spices)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59612"}, {"umap_0": 11.56545639038086, "umap_1": 10.310689926147461, "region": "Caribbean", "recipe": "Caribbean Cassis", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90958"}, {"umap_0": -0.14634838700294495, "umap_1": 4.6056599617004395, "region": "Canadian", "recipe": "Bacon Cheddar Deviled Eggs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145752"}, {"umap_0": 1.8354699611663818, "umap_1": 7.335993766784668, "region": "Northern Africa", "recipe": "Chicken Croquettes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49792"}, {"umap_0": 1.5767450332641602, "umap_1": 9.15180778503418, "region": "Belgian", "recipe": "Dutch 
Baked Beans", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134503"}, {"umap_0": 10.784056663513184, "umap_1": 7.630938529968262, "region": "US", "recipe": "Brigid's Blackberry Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/14165"}, {"umap_0": 0.8465257287025452, "umap_1": 4.669475555419922, "region": "Mexican", "recipe": "Cheesy Taco Sloppy Joes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79294"}, {"umap_0": 0.4976460933685303, "umap_1": 8.070438385009766, "region": "Italian", "recipe": "Pat's Standard Meatballs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127380"}, {"umap_0": 11.347732543945312, "umap_1": 7.652974605560303, "region": "Irish", "recipe": "Mint Chocolate Brownies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136180"}, {"umap_0": 1.5803881883621216, "umap_1": 10.38937759399414, "region": "Caribbean", "recipe": "Slow Cooker Tropical Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/8173"}, {"umap_0": 1.237654209136963, "umap_1": 6.316675186157227, "region": "Caribbean", "recipe": "Spicy Black Bean Sandwiches With Chipotle Mayonnaise", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92535"}, {"umap_0": 3.1554439067840576, "umap_1": 9.233389854431152, "region": "Chinese and Mongolian", "recipe": "Spicy Hotpot Broth (Sichuan) -- Hong Tang Lu", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53201"}, {"umap_0": 1.0343331098556519, "umap_1": 5.574160099029541, "region": "Mexican", "recipe": "Mexican Jack Cheese Omelet", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87188"}, {"umap_0": 1.652122974395752, "umap_1": 6.311713695526123, "region": "Mexican", "recipe": "Family Mexican Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79325"}, {"umap_0": 0.025085465982556343, "umap_1": 9.81269645690918, "region": "Japanese", "recipe": "Mexican Tortilla Chicken 
Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71738"}, {"umap_0": 1.4926679134368896, "umap_1": 6.970399856567383, "region": "Eastern European", "recipe": "Zelova Vomacka (Sauerkraut Gravy)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142408"}, {"umap_0": 3.6649651527404785, "umap_1": 10.051774024963379, "region": "Chinese and Mongolian", "recipe": "Another Chinese Chicken Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56547"}, {"umap_0": -2.7004785537719727, "umap_1": 7.23246431350708, "region": "Italian", "recipe": "Eggplant (Aubergine) Almost Parmigiano", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124072"}, {"umap_0": 0.025644266977906227, "umap_1": 8.840983390808105, "region": "Italian", "recipe": "Italian Tuscan Potato and Sausage Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120374"}, {"umap_0": 10.697286605834961, "umap_1": 7.963949203491211, "region": "French", "recipe": "Amazingly Delicious Chocolate Mousse", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/111370"}, {"umap_0": 0.8622411489486694, "umap_1": 6.402688503265381, "region": "Mexican", "recipe": "Mexican Cornbread II", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82970"}, {"umap_0": 9.412566184997559, "umap_1": 9.212646484375, "region": "Canadian", "recipe": "Jello Smoothie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148065"}, {"umap_0": -1.1330090761184692, "umap_1": 10.193440437316895, "region": "Deutschland", "recipe": "German Cukes &amp; Tomatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137511"}, {"umap_0": 9.357527732849121, "umap_1": 6.522380828857422, "region": "Italian", "recipe": "Italian Genetti Cake    1962", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125165"}, {"umap_0": 1.3735902309417725, "umap_1": 4.824747085571289, "region": "Mexican", "recipe": "Mom's Easy 
Enchilada Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81346"}, {"umap_0": 0.6354721784591675, "umap_1": 8.935464859008789, "region": "Canadian", "recipe": "Give Thanks Pumpkin Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146551"}, {"umap_0": 1.4770550727844238, "umap_1": 12.153374671936035, "region": "Indian Subcontinent", "recipe": "Charishma's Brown Rice and Puliogare Rice Mix", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64932"}, {"umap_0": 1.3660162687301636, "umap_1": 10.005030632019043, "region": "South American", "recipe": "Double Cooked Beef Ribs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98818"}, {"umap_0": -1.5219566822052002, "umap_1": 11.854148864746094, "region": "Mexican", "recipe": "Shrimp Ceviche on Endive", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83367"}, {"umap_0": 8.880476951599121, "umap_1": 7.366859436035156, "region": "Canadian", "recipe": "Fresh Peach Cake (Gluten Free)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147998"}, {"umap_0": 9.366851806640625, "umap_1": 9.195884704589844, "region": "Greek", "recipe": "Perfect Cup of Greek Tea", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110667"}, {"umap_0": -1.3214271068572998, "umap_1": 9.422991752624512, "region": "Deutschland", "recipe": "Quioa With Swiss Chard and Chick Peas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118442"}, {"umap_0": 0.37308236956596375, "umap_1": 8.61596393585205, "region": "Deutschland", "recipe": "Goulash Soup- German", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138722"}, {"umap_0": 2.2169711589813232, "umap_1": 11.16247844696045, "region": "Thai", "recipe": "Citrusy Tom Yum Chicken Stew With Yams &amp; Kale (Slow Cooker)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58929"}, {"umap_0": 1.4431850910186768, "umap_1": 9.236349105834961, "region": 
"Mexican", "recipe": "Mexican Fried Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78284"}, {"umap_0": -0.14353497326374054, "umap_1": 6.9521613121032715, "region": "Australian", "recipe": "Bubble and Squeak Slice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75660"}, {"umap_0": 11.5856351852417, "umap_1": 8.733628273010254, "region": "Italian", "recipe": "Stella Doro  Italian Rum Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120610"}, {"umap_0": 9.757328987121582, "umap_1": 6.803773403167725, "region": "US", "recipe": "Hawaiian Sweet Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16611"}, {"umap_0": 10.156984329223633, "umap_1": 6.4957122802734375, "region": "French", "recipe": "Superbly Rich Chocolate Tart", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112719"}, {"umap_0": 10.60612964630127, "umap_1": 7.962582111358643, "region": "Spanish and Portuguese", "recipe": "Spanish Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141278"}, {"umap_0": 3.416504383087158, "umap_1": 9.423063278198242, "region": "Japanese", "recipe": "Okura and Sakura Shrimp Japanese-Style Spaghetti", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/5062"}, {"umap_0": -2.2897472381591797, "umap_1": 8.184566497802734, "region": "Italian", "recipe": "Simple Low-Fat Turkey Meatballs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120260"}, {"umap_0": 1.1479551792144775, "umap_1": 9.258878707885742, "region": "South American", "recipe": "Beef 'n Bean Burrito Stack (Crock Pot, Slow Cooker)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/95374"}, {"umap_0": 10.10209846496582, "umap_1": 7.524328231811523, "region": "UK", "recipe": "Louise Read's Coffee-Crunch Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104370"}, {"umap_0": -1.838333010673523, "umap_1": 9.036421775817871, "region": "Middle 
Eastern", "recipe": "Preserved Lemons(ATK)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61208"}, {"umap_0": 10.957182884216309, "umap_1": 9.251062393188477, "region": "French", "recipe": "freedom fruit tart", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114699"}, {"umap_0": 2.8964154720306396, "umap_1": 10.311013221740723, "region": "Korean", "recipe": "Flash Korean Japanese Style Cucumber and Daikon Pickles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69573"}, {"umap_0": -1.9289010763168335, "umap_1": 6.805454254150391, "region": "Italian", "recipe": "Spring Primavera", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120340"}, {"umap_0": -0.52470862865448, "umap_1": 8.281837463378906, "region": "French", "recipe": "Fish Fillets in Parchment With Asparagus &amp; Orange", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115053"}, {"umap_0": 2.278409481048584, "umap_1": 11.373751640319824, "region": "Thai", "recipe": "Tom Yum Koong Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3190"}, {"umap_0": 2.523908853530884, "umap_1": 10.419310569763184, "region": "Southeast Asian", "recipe": "Indonesian Peanut Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59869"}, {"umap_0": 3.3184008598327637, "umap_1": 10.022841453552246, "region": "US", "recipe": "Sesame Chicken Pasta Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/14761"}, {"umap_0": -1.1145869493484497, "umap_1": 4.25117826461792, "region": "US", "recipe": "Pepperoni Stuffed Zucchini", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/15553"}, {"umap_0": -0.16297447681427002, "umap_1": 8.082793235778809, "region": "French", "recipe": "Chicken Breasts With Mushroom Cream Sauce over Egg Noodles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/116923"}, {"umap_0": 10.027602195739746, "umap_1": 9.561647415161133, "region": 
"Canadian", "recipe": "Apple Cinnamon Infused Vodka", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146576"}, {"umap_0": 9.60172176361084, "umap_1": 6.984546661376953, "region": "Indian Subcontinent", "recipe": "West Indian Coconut Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66590"}, {"umap_0": -2.0326576232910156, "umap_1": 8.155261039733887, "region": "Greek", "recipe": "Greek Fit-For-The-Gods Salad With Spicy Cucumber Dressing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110536"}, {"umap_0": -2.1525142192840576, "umap_1": 8.909070014953613, "region": "French", "recipe": "Scallops With Seaweed Butter", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115524"}, {"umap_0": 8.234896659851074, "umap_1": 6.744616985321045, "region": "Spanish and Portuguese", "recipe": "Portuguese Doughnuts", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118756"}, {"umap_0": -1.0125579833984375, "umap_1": 4.652006149291992, "region": "Italian", "recipe": "Spaghetti Torte", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131940"}, {"umap_0": 0.3390287160873413, "umap_1": 9.763676643371582, "region": "Middle Eastern", "recipe": "Persian Pasta Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/60046"}, {"umap_0": 3.8986332416534424, "umap_1": 9.398303985595703, "region": "Japanese", "recipe": "Reconstituted Shitake Mushrooms", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71034"}, {"umap_0": 1.6634480953216553, "umap_1": 8.457387924194336, "region": "South American", "recipe": "Tender Roast from Economical Cuts of Beef", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/94444"}, {"umap_0": 11.147115707397461, "umap_1": 5.5801167488098145, "region": "UK", "recipe": "Gluten Free Melt-In-Your-Mouth Shortbread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101540"}, {"umap_0": 10.615638732910156, "umap_1": 
9.423094749450684, "region": "Australian", "recipe": "Boysenberry Kiwi Smoothie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76789"}, {"umap_0": 1.1105831861495972, "umap_1": 5.7901082038879395, "region": "Irish", "recipe": "Potatoes, Peppers, and Beer", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136496"}, {"umap_0": 1.6699894666671753, "umap_1": 10.233530044555664, "region": "Chinese and Mongolian", "recipe": "Cantonese Meatballs (Appetizers)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/51995"}, {"umap_0": -1.6061029434204102, "umap_1": 5.809832572937012, "region": "Italian", "recipe": "Pizza Pasta Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124814"}, {"umap_0": 0.43345507979393005, "umap_1": 10.59730339050293, "region": "Mexican", "recipe": "Baja Fish Wraps With Chipotle-Lime Slaw", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84865"}, {"umap_0": 9.462594032287598, "umap_1": 5.876317024230957, "region": "Rest Africa", "recipe": "North Africian Cardamom Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49008"}, {"umap_0": 3.5356295108795166, "umap_1": 9.975740432739258, "region": "Chinese and Mongolian", "recipe": "Vegetarian Bean Sauce Noodles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56500"}, {"umap_0": -0.7860097289085388, "umap_1": 11.373174667358398, "region": "Eastern European", "recipe": "Belarussian Carrot Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100784"}, {"umap_0": -1.3339581489562988, "umap_1": 9.327593803405762, "region": "Canadian", "recipe": "Spinach with Lentils", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148944"}, {"umap_0": 9.967204093933105, "umap_1": 7.210781574249268, "region": "Scandinavian", "recipe": "Swedish Rice Pudding", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/9129"}, {"umap_0": 3.366657257080078, "umap_1": 
9.275442123413086, "region": "Japanese", "recipe": "Chicken Hekka", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/5083"}, {"umap_0": 0.8274092078208923, "umap_1": 8.276862144470215, "region": "South American", "recipe": "Brandied Beef Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99647"}, {"umap_0": 1.087394118309021, "umap_1": 11.729743003845215, "region": "Indian Subcontinent", "recipe": "Kali Mirch Murg (Chicken Curry With Black Pepper)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64743"}, {"umap_0": -1.6914072036743164, "umap_1": 9.18214225769043, "region": "Eastern European", "recipe": "Basturma (russian Marinated Skewered Beef)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100510"}, {"umap_0": 0.9825918078422546, "umap_1": 10.085289001464844, "region": "US", "recipe": "Down-South Style Green Beans", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16128"}, {"umap_0": -0.2914678156375885, "umap_1": 7.806410789489746, "region": "Canadian", "recipe": "Cheeseburger Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145147"}, {"umap_0": -1.6939055919647217, "umap_1": 9.070364952087402, "region": "Australian", "recipe": "Pasta With Leeks and Parsley", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72937"}, {"umap_0": 9.658759117126465, "umap_1": 5.5048627853393555, "region": "UK", "recipe": "Brown Sugar Shortbread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101438"}, {"umap_0": 1.2865381240844727, "umap_1": 7.969576835632324, "region": "Canadian", "recipe": "Susan's Best Cabbage Rolls", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147297"}, {"umap_0": 1.439955234527588, "umap_1": 5.958235740661621, "region": "Mexican", "recipe": "Elise's Quesadilla Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80657"}, {"umap_0": 2.4584014415740967, "umap_1": 5.440091133117676, 
"region": "South American", "recipe": "Low Cal  Beef Taco Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99840"}, {"umap_0": 1.5317089557647705, "umap_1": 5.783588409423828, "region": "Mexican", "recipe": "Chimichangas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/86813"}, {"umap_0": 10.646544456481934, "umap_1": 8.213096618652344, "region": "Mexican", "recipe": "Mexican Bread Pudding Capidotada", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83221"}, {"umap_0": -0.8091127872467041, "umap_1": 5.588508129119873, "region": "Italian", "recipe": "Alfredo Sauce (Of Death)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127508"}, {"umap_0": 0.225128635764122, "umap_1": 12.402318000793457, "region": "Indian Subcontinent", "recipe": "Low Fat Aloo Gobi", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/68531"}, {"umap_0": -1.2968248128890991, "umap_1": 4.513106346130371, "region": "French", "recipe": "Meaty French Bread Pizza", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114503"}, {"umap_0": -2.5366716384887695, "umap_1": 8.779463768005371, "region": "Italian", "recipe": "Bread Salad With Fresh Tomatoes (panzanella)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129737"}, {"umap_0": 10.623146057128906, "umap_1": 7.5068488121032715, "region": "Canadian", "recipe": "Real Canadian Butter Tarts", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143510"}, {"umap_0": -2.386587381362915, "umap_1": 8.732303619384766, "region": "Australian", "recipe": "Cauliflower, Anchovy and Olive salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75638"}, {"umap_0": 3.113199472427368, "umap_1": 10.272427558898926, "region": "Australian", "recipe": "Marinated Prawns", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74652"}, {"umap_0": 10.777298927307129, "umap_1": 8.428910255432129, "region": "French", "recipe": 
"French Toast With Pineapple-Fig Flambe", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113306"}, {"umap_0": 7.68690824508667, "umap_1": 6.4767351150512695, "region": "Spanish and Portuguese", "recipe": "Ojaldas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140733"}, {"umap_0": 1.9333122968673706, "umap_1": 10.470117568969727, "region": "Australian", "recipe": "Chicken With Cranberries and Orange", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75169"}, {"umap_0": -0.4192400574684143, "umap_1": 6.965161323547363, "region": "Canadian", "recipe": "Salmon Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148789"}, {"umap_0": -2.6806905269622803, "umap_1": 9.345675468444824, "region": "Greek", "recipe": "Granny's Greek Fried Potatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110992"}, {"umap_0": -0.20145563781261444, "umap_1": 10.386892318725586, "region": "Chinese and Mongolian", "recipe": "Spicy Orange Beef Kabobs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56952"}, {"umap_0": 0.10434052348136902, "umap_1": 8.132986068725586, "region": "South American", "recipe": "Sandaidh's Crock Pot Beef Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99856"}, {"umap_0": 2.9792513847351074, "umap_1": 9.98609447479248, "region": "Chinese and Mongolian", "recipe": "Cream Cheese, Chinese Garlic-Hot Sauce, &amp; Kim Chee Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/52425"}, {"umap_0": 11.401833534240723, "umap_1": 9.952619552612305, "region": "Rest Africa", "recipe": "Fruit and Nuts With Cream", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/51173"}, {"umap_0": -1.1528853178024292, "umap_1": 4.63319730758667, "region": "Italian", "recipe": "Delicate Cheese Ravioli Ala Filomena (And Baked Version, Too!)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132849"}, {"umap_0": 
-0.07309368252754211, "umap_1": 11.330818176269531, "region": "South American", "recipe": "Beef Brisket With Oranges", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96549"}, {"umap_0": 11.925108909606934, "umap_1": 8.318337440490723, "region": "Eastern European", "recipe": "Sliwka Naleczowska W Czekoladzie -- Polish Chocolate Plums", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/133650"}, {"umap_0": 1.4653419256210327, "umap_1": 9.624805450439453, "region": "Spanish and Portuguese", "recipe": "Memory Lane Spanish Roast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140359"}, {"umap_0": 3.3458211421966553, "umap_1": 10.333921432495117, "region": "Chinese and Mongolian", "recipe": "Beef &amp; Cashew Stir Fry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/52853"}, {"umap_0": -0.9238266348838806, "umap_1": 6.956496715545654, "region": "Australian", "recipe": "Salmon &amp; Leek Filo Parcels", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76031"}, {"umap_0": 10.662707328796387, "umap_1": 10.810108184814453, "region": "Mexican", "recipe": "Watermelon-Basil Margaritas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82476"}, {"umap_0": 0.18920071423053741, "umap_1": 5.791487693786621, "region": "Deutschland", "recipe": "Creamy Swiss &amp; Ham Stuffed Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/117728"}, {"umap_0": -1.0673952102661133, "umap_1": 9.55289077758789, "region": "Mexican", "recipe": "Cancun Pescado Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87770"}, {"umap_0": 1.3067034482955933, "umap_1": 4.882326126098633, "region": "Mexican", "recipe": "Mom's Mexican Chicken (Jennifer Sue Jacks Henley)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6661"}, {"umap_0": 9.685789108276367, "umap_1": 8.276434898376465, "region": "Canadian", "recipe": "Heisenberg\u2019s Blue Rock \u201cMeth\u201d 
Candy", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146079"}, {"umap_0": 0.756194531917572, "umap_1": 9.154415130615234, "region": "Deutschland", "recipe": "Apricot Curried Rice with Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137196"}, {"umap_0": -1.0771478414535522, "umap_1": 9.509602546691895, "region": "US", "recipe": "Easy Roast Leg of Lamb", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16521"}, {"umap_0": -1.094968318939209, "umap_1": 4.1139020919799805, "region": "Italian", "recipe": "Spinach &amp; Ravioli Lasagna", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129917"}, {"umap_0": 2.24851655960083, "umap_1": 7.521053791046143, "region": "US", "recipe": "Mom's Texas Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17565"}, {"umap_0": -0.38590046763420105, "umap_1": 8.061714172363281, "region": "Canadian", "recipe": "Southern Pecan crusted Chicken with Mustard sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146719"}, {"umap_0": 0.3166462182998657, "umap_1": 9.926994323730469, "region": "Mexican", "recipe": "Fish Tacos - Baja Style", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84042"}, {"umap_0": 8.933710098266602, "umap_1": 6.923816204071045, "region": "French", "recipe": "Cherry Crepes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114439"}, {"umap_0": 1.037848949432373, "umap_1": 5.429193019866943, "region": "Mexican", "recipe": "Monterey Ole' Mexican Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84613"}, {"umap_0": 11.82238483428955, "umap_1": 7.95355224609375, "region": "Deutschland", "recipe": "German Chocolate Cream Cheese Brownies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137649"}, {"umap_0": 2.0246706008911133, "umap_1": 6.24749231338501, "region": "Mexican", "recipe": "Fire-Roasted Corn and Poblano Tacos", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87477"}, {"umap_0": -1.6825156211853027, "umap_1": 10.868810653686523, "region": "Northern Africa", "recipe": "Roasted Carrots -Memories of Morocco (Ww Core)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49537"}, {"umap_0": 0.9019862413406372, "umap_1": 7.4742655754089355, "region": "Deutschland", "recipe": "Tyrolean Bacon-Dumpling Soup (Speckknodelsuppe)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/107418"}, {"umap_0": 1.742106318473816, "umap_1": 10.877889633178711, "region": "Indian Subcontinent", "recipe": "Curried Sweet Potatoes With Green Peas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64254"}, {"umap_0": 1.005946159362793, "umap_1": 9.449018478393555, "region": "Caribbean", "recipe": "Jamaican Beef Patties (Beef in Pastry)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90703"}, {"umap_0": 0.5261460542678833, "umap_1": 10.313401222229004, "region": "Irish", "recipe": "Ancho Beer Chili", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134708"}, {"umap_0": 9.465495109558105, "umap_1": 8.637487411499023, "region": "Greek", "recipe": "Greek Pasteli (Sesame Seed &amp; Honey Snacks)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/109056"}, {"umap_0": -1.575779676437378, "umap_1": 9.492369651794434, "region": "Middle Eastern", "recipe": "Garlic Rice with Pine Nuts", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61731"}, {"umap_0": 10.504138946533203, "umap_1": 8.11951732635498, "region": "Deutschland", "recipe": "Apple Strudel Two Ways", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137673"}, {"umap_0": 0.4395843744277954, "umap_1": 8.232477188110352, "region": "South American", "recipe": "Favorite Beef Vegetable Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98710"}, {"umap_0": -0.8889778256416321, "umap_1": 4.557959079742432, 
"region": "Italian", "recipe": "Italian Omelette", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123885"}, {"umap_0": -1.4320403337478638, "umap_1": 8.87225341796875, "region": "French", "recipe": "Bistro Bouillabaisse", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/116716"}, {"umap_0": 0.4621977210044861, "umap_1": 5.162688255310059, "region": "Irish", "recipe": "Full Ulster Fry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135213"}, {"umap_0": 0.31085625290870667, "umap_1": 8.173240661621094, "region": "Australian", "recipe": "Salt and Pepper Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73573"}, {"umap_0": 9.681034088134766, "umap_1": 5.524683475494385, "region": "Irish", "recipe": "Herb Irish Soda Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135927"}, {"umap_0": 0.5853262543678284, "umap_1": 4.925212860107422, "region": "Australian", "recipe": "Cheese and Leek Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74478"}, {"umap_0": 2.3423733711242676, "umap_1": 9.730361938476562, "region": "South American", "recipe": "Beef With Cabbage and Mushrooms", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98989"}, {"umap_0": 11.362954139709473, "umap_1": 10.082721710205078, "region": "Irish", "recipe": "Tipperary", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134675"}, {"umap_0": 2.0764663219451904, "umap_1": 5.409379482269287, "region": "Mexican", "recipe": "Vegetarian Nachos Especial", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87135"}, {"umap_0": -0.14368288218975067, "umap_1": 5.6611151695251465, "region": "Chinese and Mongolian", "recipe": "Anne-Marie's Budget Saving Cheesie Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53689"}, {"umap_0": 11.728506088256836, "umap_1": 8.229470252990723, "region": "Canadian", "recipe": "Cheesecake Supreme", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143316"}, {"umap_0": 3.080148696899414, "umap_1": 9.291823387145996, "region": "UK", "recipe": "Beef, Guinness and Mushroom Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102578"}, {"umap_0": -1.6427479982376099, "umap_1": 10.05509090423584, "region": "Northern Africa", "recipe": "Moroccan Spiced Olives", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49677"}, {"umap_0": 0.8318037390708923, "umap_1": 7.997498512268066, "region": "Deutschland", "recipe": "Joe's Swiss Roast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/117526"}, {"umap_0": 0.2965645492076874, "umap_1": 5.572545528411865, "region": "Deutschland", "recipe": "Ham and Swiss in a Bun", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118133"}, {"umap_0": 0.19851399958133698, "umap_1": 12.055628776550293, "region": "Indian Subcontinent", "recipe": "Long Beans With Tomatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67135"}, {"umap_0": 1.0551636219024658, "umap_1": 8.346288681030273, "region": "South American", "recipe": "Potted Stuffed Beef Heart and Madeira Gravy", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/94815"}, {"umap_0": 11.078954696655273, "umap_1": 7.712956428527832, "region": "Spanish and Portuguese", "recipe": "Spanish Spiced Hot Chocolate", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141503"}, {"umap_0": 1.8542250394821167, "umap_1": 6.653536796569824, "region": "Mexican", "recipe": "The Ospidillo Cafe Chilies Rellenos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80807"}, {"umap_0": -1.3142787218093872, "umap_1": 8.45578384399414, "region": "French", "recipe": "Veal Medallions With French Morels", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113561"}, {"umap_0": -1.462088704109192, "umap_1": 8.64195728302002, "region": "US", "recipe": "Sloppy Joe with Ground 
Turkey", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/15744"}, {"umap_0": 2.9857001304626465, "umap_1": 6.698624610900879, "region": "UK", "recipe": "Marmite and Cheese Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102443"}, {"umap_0": 1.0651118755340576, "umap_1": 12.153023719787598, "region": "Indian Subcontinent", "recipe": "Vegetable Biryani", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4018"}, {"umap_0": -1.4966187477111816, "umap_1": 12.058318138122559, "region": "Mexican", "recipe": "Super Fabulous Guacamole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85691"}, {"umap_0": -1.5947813987731934, "umap_1": 8.943644523620605, "region": "Italian", "recipe": "Olive Garden Pollo Limone (Lemon Chicken)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122863"}, {"umap_0": 10.104763984680176, "umap_1": 6.601298809051514, "region": "Italian", "recipe": "Chocolate Citrus Biscotti", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120967"}, {"umap_0": -2.2293882369995117, "umap_1": 9.31502628326416, "region": "South American", "recipe": "Chimichurri Muy Pefecto", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/8507"}, {"umap_0": -1.3183051347732544, "umap_1": 10.387813568115234, "region": "Greek", "recipe": "Lentils Esau", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/109245"}, {"umap_0": 0.051858969032764435, "umap_1": 6.969412803649902, "region": "Canadian", "recipe": "Holiday Pecan Stuffing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142646"}, {"umap_0": 1.5572264194488525, "umap_1": 5.45778751373291, "region": "Mexican", "recipe": "Alpo Dip!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79243"}, {"umap_0": 2.7580223083496094, "umap_1": 10.663866996765137, "region": "Rest Africa", "recipe": "Congo Tofu (Or Chicken)", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48628"}, {"umap_0": -0.2753370404243469, "umap_1": 6.4594926834106445, "region": "Spanish and Portuguese", "recipe": "Double Decker Tomato Sandwiches", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140916"}, {"umap_0": -2.074056625366211, "umap_1": 6.940325736999512, "region": "Italian", "recipe": "Berry Bruschetta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132207"}, {"umap_0": 0.08646221458911896, "umap_1": 8.49849796295166, "region": "Italian", "recipe": "Italian Dressing Meatloaf", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125058"}, {"umap_0": -2.5336055755615234, "umap_1": 9.157170295715332, "region": "Greek", "recipe": "Skorthalia (Greek Garlic Sauce)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/109394"}, {"umap_0": -1.004734992980957, "umap_1": 9.219395637512207, "region": "French", "recipe": "Fish Soup Provencale", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112973"}, {"umap_0": 2.7159018516540527, "umap_1": 8.857605934143066, "region": "Japanese", "recipe": "Avocado Sushi with Brown Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/5118"}, {"umap_0": 10.541696548461914, "umap_1": 7.53225564956665, "region": "Italian", "recipe": "Have It Your Way Biscotti", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120803"}, {"umap_0": 2.7782957553863525, "umap_1": 10.272409439086914, "region": "Southeast Asian", "recipe": "Adobo Style Shrimp (Filipino)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63692"}, {"umap_0": 10.56124210357666, "umap_1": 10.568076133728027, "region": "Mexican", "recipe": "Perro Salado (Tequila Salty Dog)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81265"}, {"umap_0": -1.3881385326385498, "umap_1": 7.135460376739502, "region": "Italian", "recipe": "Cheat's Easy Salmon Risotto (Rice Cooker)", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125824"}, {"umap_0": -0.8828059434890747, "umap_1": 5.964145183563232, "region": "Scandinavian", "recipe": "Artichoke-Red Pepper Pesto Spread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/139781"}, {"umap_0": 0.9058073163032532, "umap_1": 5.742369174957275, "region": "Mexican", "recipe": "Mexican Chicken Rolls", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/88682"}, {"umap_0": 2.0204532146453857, "umap_1": 6.9913482666015625, "region": "Mexican", "recipe": "Easy Tamale Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83037"}, {"umap_0": -1.793311595916748, "umap_1": 8.244132041931152, "region": "US", "recipe": "Southern Turnip Greens with Sausage", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16160"}, {"umap_0": -1.4402920007705688, "umap_1": 12.00793743133545, "region": "Mexican", "recipe": "Los Salsa Roja de Grackles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87851"}, {"umap_0": 1.8460345268249512, "umap_1": 7.46370792388916, "region": "Japanese", "recipe": "Curry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71497"}, {"umap_0": 3.2214860916137695, "umap_1": 9.551199913024902, "region": "Japanese", "recipe": "Izakaya Sakura Five Spice Soba", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71902"}, {"umap_0": 0.057061370462179184, "umap_1": 8.298545837402344, "region": "South American", "recipe": "Low Fat Wok Strogonof", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98141"}, {"umap_0": 10.255284309387207, "umap_1": 10.028037071228027, "region": "US", "recipe": "Virgin Strawberry Daiquiri", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16080"}, {"umap_0": 2.56575870513916, "umap_1": 10.922219276428223, "region": "South American", "recipe": "Sweet beef curry with corn and peas", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99522"}, {"umap_0": -0.36123520135879517, "umap_1": 9.023611068725586, "region": "Spanish and Portuguese", "recipe": "Arroz De Cabidela", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/119185"}, {"umap_0": -0.7476745247840881, "umap_1": 8.673479080200195, "region": "US", "recipe": "Original Blue Cheese Coleslaw", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17603"}, {"umap_0": -1.8240654468536377, "umap_1": 8.180423736572266, "region": "Italian", "recipe": "Italian Tuna, Oven-Roasted Tomato, and Arugula Panini", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130315"}, {"umap_0": 10.154010772705078, "umap_1": 7.700729846954346, "region": "Italian", "recipe": "Cannoli III", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/11127"}, {"umap_0": 10.479533195495605, "umap_1": 8.134147644042969, "region": "Caribbean", "recipe": "Caribbean Key Lime Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90104"}, {"umap_0": 2.5467443466186523, "umap_1": 10.30854606628418, "region": "US", "recipe": "Sharkey's Barbequed Trout", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/14146"}, {"umap_0": -2.552872896194458, "umap_1": 8.35036849975586, "region": "Italian", "recipe": "Easy Uncooked Tomato Sauce for Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/126605"}, {"umap_0": 1.908292293548584, "umap_1": 7.644710540771484, "region": "Australian", "recipe": "Lamb Haters Crumbed Cutlets", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75985"}, {"umap_0": -2.313588857650757, "umap_1": 10.131288528442383, "region": "Canadian", "recipe": "Spinach, Mandarin &amp; Cashew Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146450"}, {"umap_0": -1.7652684450149536, "umap_1": 7.72715425491333, "region": "Italian", "recipe": "Garlic Shrimp and Pasta", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130743"}, {"umap_0": -1.013270616531372, "umap_1": 8.606904029846191, "region": "Irish", "recipe": "Gorgeous Vegetarian Rice Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136462"}, {"umap_0": 2.834089994430542, "umap_1": 9.519465446472168, "region": "US", "recipe": "Garlic-Ginger Chicken Wings", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17777"}, {"umap_0": 0.09859195351600647, "umap_1": 11.320109367370605, "region": "South American", "recipe": "Chuckwagon Beef Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/95600"}, {"umap_0": 0.3424185514450073, "umap_1": 8.87253189086914, "region": "Central American", "recipe": "Poyha (Old Native American Recipe) Meat &amp; Cormeal", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93001"}, {"umap_0": 0.7457483410835266, "umap_1": 10.03559398651123, "region": "Mexican", "recipe": "Fab-u-lous Refried Beans!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6175"}, {"umap_0": 1.8115586042404175, "umap_1": 7.452395439147949, "region": "Eastern European", "recipe": "Chicken Paprikash With Spaetzle", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105547"}, {"umap_0": -1.4686909914016724, "umap_1": 8.575615882873535, "region": "Middle Eastern", "recipe": "Authentic Lebanese Fattoush", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3818"}, {"umap_0": 7.602921485900879, "umap_1": 7.916861057281494, "region": "Japanese", "recipe": "Make-Ahead Sushi Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71172"}, {"umap_0": 2.607903003692627, "umap_1": 10.810440063476562, "region": "Thai", "recipe": "Slow Cooker Thai Pork With Peanut Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58172"}, {"umap_0": 3.679088592529297, "umap_1": 9.638720512390137, "region": "Chinese and Mongolian", "recipe": "Beef With 
Broccoli", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55821"}, {"umap_0": 10.81221866607666, "umap_1": 7.724643230438232, "region": "Deutschland", "recipe": "Chocolate Cheese Fondue", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118572"}, {"umap_0": 3.12463116645813, "umap_1": 7.510646820068359, "region": "Italian", "recipe": "Focaccia Buns", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124056"}, {"umap_0": 2.524719715118408, "umap_1": 10.462780952453613, "region": "Mexican", "recipe": "Smashed Mexican Black Beans", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78044"}, {"umap_0": 3.2224416732788086, "umap_1": 10.0649995803833, "region": "Chinese and Mongolian", "recipe": "Cantonese Roast Duck", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55276"}, {"umap_0": 2.668879747390747, "umap_1": 10.186763763427734, "region": "Chinese and Mongolian", "recipe": "Chicken, Cashews and Red Pepper Stir-Fry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56201"}, {"umap_0": -1.5967803001403809, "umap_1": 8.705949783325195, "region": "Northern Africa", "recipe": "Fish Tagine With Tomatoes, Capers, and Cinnamon", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49753"}, {"umap_0": 10.841547012329102, "umap_1": 6.848720550537109, "region": "UK", "recipe": "Easy Mexican Hot Chocolate", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/8887"}, {"umap_0": 10.348551750183105, "umap_1": 7.828114986419678, "region": "Italian", "recipe": "Spumetti (Chocolate-Hazelnut Meringues)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120851"}, {"umap_0": 11.177433013916016, "umap_1": 7.558730125427246, "region": "Australian", "recipe": "Dessert Jaffles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74634"}, {"umap_0": 11.351784706115723, "umap_1": 8.262283325195312, "region": "Italian", "recipe": "Italian Rice Pie", 
"url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/121492"}, {"umap_0": 0.8606560826301575, "umap_1": 5.519354820251465, "region": "US", "recipe": "Louisiana Shrimp Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16983"}, {"umap_0": 10.583642959594727, "umap_1": 7.005478858947754, "region": "Australian", "recipe": "Easy Chocolate Pancakes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73963"}, {"umap_0": 0.2288971096277237, "umap_1": 5.716791152954102, "region": "Caribbean", "recipe": "Trinidad Pepper Quiche", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91522"}, {"umap_0": -1.0484470129013062, "umap_1": 9.63629150390625, "region": "Middle Eastern", "recipe": "Middle Eastern Beef Burgers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62952"}, {"umap_0": 10.229118347167969, "umap_1": 7.130414009094238, "region": "Australian", "recipe": "Hot Cross Buns I", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/5368"}, {"umap_0": 9.117935180664062, "umap_1": 7.722469806671143, "region": "Japanese", "recipe": "Apricot Mochi", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70999"}, {"umap_0": 0.8308312296867371, "umap_1": 10.961265563964844, "region": "Mexican", "recipe": "Low Carb Mexican \"rice\"", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77998"}, {"umap_0": 10.630115509033203, "umap_1": 7.235660076141357, "region": "Chinese and Mongolian", "recipe": "Rice Pudding (For One)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53455"}, {"umap_0": 2.0343143939971924, "umap_1": 5.875593185424805, "region": "Spanish and Portuguese", "recipe": "Roti", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141768"}, {"umap_0": 1.26576828956604, "umap_1": 11.896153450012207, "region": "Indian Subcontinent", "recipe": "The New Exotic Marigold Curry Chicken", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/65176"}, {"umap_0": -1.100703477859497, "umap_1": 7.347123146057129, "region": "Irish", "recipe": "Brussels Sprouts wth Pecans, Basil and Garlic", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135767"}, {"umap_0": -0.2541266083717346, "umap_1": 10.549250602722168, "region": "Mexican", "recipe": "Frijoles Rancheros", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81632"}, {"umap_0": 0.041061025112867355, "umap_1": 10.655411720275879, "region": "Indian Subcontinent", "recipe": "Bengali Dhal", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3711"}, {"umap_0": 2.986401081085205, "umap_1": 9.628595352172852, "region": "Chinese and Mongolian", "recipe": "Chinese Five Spice Salmon With Ginger Broth", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55266"}, {"umap_0": 10.449230194091797, "umap_1": 9.624399185180664, "region": "Rest Africa", "recipe": "Papaya Candy", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49316"}, {"umap_0": -1.1819052696228027, "umap_1": 9.288045883178711, "region": "Spanish and Portuguese", "recipe": "Traditional Garlic Shrimp", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118998"}, {"umap_0": 9.8220796585083, "umap_1": 9.752531051635742, "region": "Rest Africa", "recipe": "Tej (Ethiopian Honey Wine)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/51512"}, {"umap_0": 7.734805583953857, "umap_1": 7.846033096313477, "region": "Chinese and Mongolian", "recipe": "Helen's Cheese Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/54520"}, {"umap_0": 3.1828866004943848, "umap_1": 9.53073501586914, "region": "Chinese and Mongolian", "recipe": "Noble House Beef Tomato Chow Mein", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56985"}, {"umap_0": -0.10581226646900177, "umap_1": 8.19990062713623, "region": "UK", "recipe": "Bubble and Squeak 
Cakes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102609"}, {"umap_0": 2.2212445735931396, "umap_1": 8.82559871673584, "region": "Australian", "recipe": "Eira's Australian Mayonnaise", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75289"}, {"umap_0": 0.9013664722442627, "umap_1": 7.409241199493408, "region": "South American", "recipe": "Beef Rissoles (Kind of Like Hamburgers)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/95751"}, {"umap_0": 2.173975944519043, "umap_1": 7.53801965713501, "region": "Rest Africa", "recipe": "Potato Stuffed With Egg (Nafaqo)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47508"}, {"umap_0": 9.074942588806152, "umap_1": 9.018590927124023, "region": "Indian Subcontinent", "recipe": "Spicy Chai Tea", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67728"}, {"umap_0": 10.422380447387695, "umap_1": 5.937242031097412, "region": "Greek", "recipe": "Kourabiedes (Greek Butter Cookies)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/107846"}, {"umap_0": 0.8908785581588745, "umap_1": 10.590702056884766, "region": "Indian Subcontinent", "recipe": "Fried Pineapple Rice(Vegan)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64804"}, {"umap_0": -1.793685793876648, "umap_1": 8.096028327941895, "region": "French", "recipe": "Pecan-Stuffed Mushrooms", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114945"}, {"umap_0": 3.816594123840332, "umap_1": 10.079052925109863, "region": "Canadian", "recipe": "Sesame Green Beans &amp; Peppers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143645"}, {"umap_0": 1.8684269189834595, "umap_1": 7.361056804656982, "region": "South American", "recipe": "Savory Beef and Noodles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/94767"}, {"umap_0": 1.890509843826294, "umap_1": 8.794905662536621, "region": "Spanish and Portuguese", "recipe": 
"Authentic Spanish Rice--From Famed De Grazia", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141990"}, {"umap_0": 1.9873422384262085, "umap_1": 5.860854625701904, "region": "Mexican", "recipe": "Potato Rolled Tacos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78600"}, {"umap_0": 10.642098426818848, "umap_1": 8.38407039642334, "region": "Canadian", "recipe": "Nettie Cronish's Chocolate Tofu Cheesecake for Nonbelievers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148992"}, {"umap_0": -0.5446988344192505, "umap_1": 10.437939643859863, "region": "Spanish and Portuguese", "recipe": "Lobster and Scallop Ceviche", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140254"}, {"umap_0": -1.2126994132995605, "umap_1": 4.494453430175781, "region": "Italian", "recipe": "Bruschetta for One or Two", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132854"}, {"umap_0": 10.421103477478027, "umap_1": 10.151790618896484, "region": "UK", "recipe": "Pimm's With Lemonade", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101429"}, {"umap_0": 1.2626316547393799, "umap_1": 12.198409080505371, "region": "Indian Subcontinent", "recipe": "One Dish Meal: Carrot and Tomato Pulao", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/68159"}, {"umap_0": 9.729538917541504, "umap_1": 5.6272172927856445, "region": "Canadian", "recipe": "Maple Grandfathers (Grands-Peres a L\u2019erable)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143062"}, {"umap_0": 10.417007446289062, "umap_1": 10.08156681060791, "region": "Australian", "recipe": "Lychee Lime Sparkler", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74145"}, {"umap_0": 0.23629122972488403, "umap_1": 13.682601928710938, "region": "Indian Subcontinent", "recipe": "Curry Powder - Jaffrey", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67123"}, {"umap_0": 1.7171356678009033, 
"umap_1": 10.082113265991211, "region": "Caribbean", "recipe": "Caribbean Chicken With Black Beans", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91036"}, {"umap_0": 2.4276649951934814, "umap_1": 8.936891555786133, "region": "Thai", "recipe": "Thai Beef With Broccoli in Oyster Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/57561"}, {"umap_0": -1.6275815963745117, "umap_1": 8.326469421386719, "region": "Scandinavian", "recipe": "Danish Blue Cheese Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101027"}, {"umap_0": -1.5368634462356567, "umap_1": 9.103995323181152, "region": "Deutschland", "recipe": "Potato and Parsley Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/139022"}, {"umap_0": 0.023742470890283585, "umap_1": 5.643985271453857, "region": "Canadian", "recipe": "Gourmet Potato Skins", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144027"}, {"umap_0": 9.981390953063965, "umap_1": 9.217269897460938, "region": "Northern Africa", "recipe": "Moroccan Spiced Oranges", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/50927"}, {"umap_0": 2.500869035720825, "umap_1": 6.922139644622803, "region": "Irish", "recipe": "St Patrick's Day Muffins (Savoury Potato Muffins)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136401"}, {"umap_0": 0.5113669037818909, "umap_1": 10.425747871398926, "region": "Mexican", "recipe": "Spicy Shredded Brisket", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81207"}, {"umap_0": -1.8251659870147705, "umap_1": 8.923802375793457, "region": "UK", "recipe": "Easy One Tray  Roasted Onions and Root Vegetables", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104652"}, {"umap_0": 10.66236686706543, "umap_1": 9.936712265014648, "region": "Caribbean", "recipe": "Charle's Caribbean", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90065"}, {"umap_0": 
10.87686824798584, "umap_1": 8.276996612548828, "region": "Rest Africa", "recipe": "Banana &amp; Date Dessert", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49057"}, {"umap_0": 1.2176278829574585, "umap_1": 12.627145767211914, "region": "Indian Subcontinent", "recipe": "Simple Microwave Vegetable Pulao", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66305"}, {"umap_0": -1.8016960620880127, "umap_1": 5.661343574523926, "region": "Italian", "recipe": "Grilled Garden Vegetable Sandwiches", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129354"}, {"umap_0": -0.481958270072937, "umap_1": 12.602921485900879, "region": "Caribbean", "recipe": "Jamaican Jerk Dry Rub", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90764"}, {"umap_0": 9.189184188842773, "umap_1": 7.048974514007568, "region": "Scandinavian", "recipe": "Swedish Apple Yeast  Bread-Betty Crocker Recipe Cards", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/106270"}, {"umap_0": -2.47800874710083, "umap_1": 8.358266830444336, "region": "Greek", "recipe": "Greek Salad by Ina Garten (Barefoot Contessa)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/107657"}, {"umap_0": 0.019511450082063675, "umap_1": 10.462817192077637, "region": "Indian Subcontinent", "recipe": "Twisted Beef Koftas (Middle-Eastern Meatballs)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66126"}, {"umap_0": 10.196945190429688, "umap_1": 6.3405303955078125, "region": "Spanish and Portuguese", "recipe": "Sponge Cake (Portuguese Pao De Lo)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118818"}, {"umap_0": 0.25186988711357117, "umap_1": 10.787273406982422, "region": "Middle Eastern", "recipe": "Dopiazeh", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61272"}, {"umap_0": -1.9276808500289917, "umap_1": 8.147125244140625, "region": "Greek", "recipe": "Greek-Style Spaghetti", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110587"}, {"umap_0": 11.401742935180664, "umap_1": 6.709829807281494, "region": "Deutschland", "recipe": "Mary's German Chocolate Cake Icing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137602"}, {"umap_0": -1.5942800045013428, "umap_1": 10.639989852905273, "region": "Middle Eastern", "recipe": "Turkish Spiced Tomato Soup (Soguk Domates Corbasi)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69984"}, {"umap_0": 10.645520210266113, "umap_1": 10.292980194091797, "region": "Spanish and Portuguese", "recipe": "Crazy Huey Martini", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118678"}, {"umap_0": -2.68239688873291, "umap_1": 7.819113731384277, "region": "Greek", "recipe": "Grecian Grilled Cheese", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110785"}, {"umap_0": 3.7801709175109863, "umap_1": 10.404820442199707, "region": "Korean", "recipe": "The Kim Family Korean Beef Recipe", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69133"}, {"umap_0": -1.2185335159301758, "umap_1": 9.273280143737793, "region": "Spanish and Portuguese", "recipe": "Spanish Spiked Gazpacho in Cucumber Cups", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141684"}, {"umap_0": -0.8615046143531799, "umap_1": 10.662614822387695, "region": "Mexican", "recipe": "Courgettes (Zucchini) With Cheese and Green Chillies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78108"}, {"umap_0": -0.5264487862586975, "umap_1": 7.356823444366455, "region": "Italian", "recipe": "Hazelnut Fig Twists", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130746"}, {"umap_0": 8.925804138183594, "umap_1": 6.206500053405762, "region": "UK", "recipe": "Cleaned-Up Scottish Oatcakes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101386"}, {"umap_0": -0.9624252915382385, "umap_1": 7.129757881164551, "region": 
"Italian", "recipe": "Broccoli Lasagna Rolls", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124188"}, {"umap_0": 9.628345489501953, "umap_1": 9.630969047546387, "region": "Mexican", "recipe": "Hibiscus Punch", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80752"}, {"umap_0": -2.0603652000427246, "umap_1": 9.247681617736816, "region": "Rest Africa", "recipe": "Moroccan Bread Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48122"}, {"umap_0": 2.4109160900115967, "umap_1": 8.286372184753418, "region": "Deutschland", "recipe": "The Mystery Chef's Swiss Steak", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/117938"}, {"umap_0": 10.97363567352295, "umap_1": 5.934720516204834, "region": "Mexican", "recipe": "Mexican Hot Chocolate Balls", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84584"}, {"umap_0": -0.577545166015625, "umap_1": 8.152962684631348, "region": "Italian", "recipe": "Judy's Hearty Vegetable Minestrone Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/10261"}, {"umap_0": 1.5284250974655151, "umap_1": 9.540640830993652, "region": "Indian Subcontinent", "recipe": "Upma", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67081"}, {"umap_0": 0.9554046392440796, "umap_1": 6.8464226722717285, "region": "French", "recipe": "French Onion Meatballs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114902"}, {"umap_0": -0.20772495865821838, "umap_1": 11.543726921081543, "region": "Indian Subcontinent", "recipe": "Chickpeas With Eggplant (Aubergine)  and Spinach", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/65494"}, {"umap_0": 2.403017282485962, "umap_1": 5.687339782714844, "region": "Mexican", "recipe": "Stuffed Clams Roasted in the Coals", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81408"}, {"umap_0": -0.9155511856079102, "umap_1": 8.826055526733398, "region": "Spanish and 
Portuguese", "recipe": "Pork Loin With Mushroom (Lomo Con Champinones)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141234"}, {"umap_0": 9.210749626159668, "umap_1": 6.720582485198975, "region": "US", "recipe": "Amish Friendship Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/15560"}, {"umap_0": -2.338742971420288, "umap_1": 7.879636287689209, "region": "Italian", "recipe": "Sicilian Broccoli", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120141"}, {"umap_0": 10.854401588439941, "umap_1": 10.377493858337402, "region": "Mexican", "recipe": "La Vida Mezcal", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6541"}, {"umap_0": -0.7336722016334534, "umap_1": 5.696012020111084, "region": "Italian", "recipe": "Creamy Parmesan Garlic Polenta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/119393"}, {"umap_0": 3.6414594650268555, "umap_1": 9.672829627990723, "region": "Japanese", "recipe": "Tantanmen (Tokyo Style Shoyu Sesame and Chile Ramen)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70604"}, {"umap_0": 9.877556800842285, "umap_1": 9.853166580200195, "region": "Japanese", "recipe": "Black Currant and Primitivo Wine Jam", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70732"}, {"umap_0": 1.227497935295105, "umap_1": 6.651404857635498, "region": "US", "recipe": "Pita Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/14499"}, {"umap_0": 1.8189146518707275, "umap_1": 6.107285022735596, "region": "Mexican", "recipe": "Chicken Tacos With Tomatillo Ranch", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/89414"}, {"umap_0": -2.254770517349243, "umap_1": 8.518220901489258, "region": "Italian", "recipe": "Bruschetta With Roasted Garlic and Cherry Tomatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132643"}, {"umap_0": 1.1190015077590942, "umap_1": 7.961056232452393, "region": "Mexican", 
"recipe": "Tlalpe\u00f1o Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87190"}, {"umap_0": 0.29040607810020447, "umap_1": 10.609500885009766, "region": "Chinese and Mongolian", "recipe": "Easy Authentic Mexican Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/54525"}, {"umap_0": 10.672554016113281, "umap_1": 8.970206260681152, "region": "Middle Eastern", "recipe": "Molasses Cones", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47481"}, {"umap_0": -0.7910933494567871, "umap_1": 6.165074348449707, "region": "UK", "recipe": "Wicklewood's Welsh Mushrooms", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105124"}, {"umap_0": 0.6991746425628662, "umap_1": 10.969401359558105, "region": "US", "recipe": "Smokey Tequila-Lime Marinade", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17526"}, {"umap_0": 9.667543411254883, "umap_1": 6.912991046905518, "region": "Belgian", "recipe": "Apple Fritters", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134394"}, {"umap_0": 0.2740262746810913, "umap_1": 12.643250465393066, "region": "Southeast Asian", "recipe": "Soup Kambing (Malaysian Lamb Soup)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69789"}, {"umap_0": -1.5580838918685913, "umap_1": 7.757184982299805, "region": "French", "recipe": "Easy Ratatouille", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112268"}, {"umap_0": 10.338214874267578, "umap_1": 6.8388190269470215, "region": "UK", "recipe": "Christmas Cherry Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/103345"}, {"umap_0": 10.939335823059082, "umap_1": 7.31587553024292, "region": "Deutschland", "recipe": "Twelfth Night Cake or King Cake(Galette Des Rois)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/117523"}, {"umap_0": 9.025073051452637, "umap_1": 8.26443862915039, "region": "Indian Subcontinent", "recipe": "Paal Payasam (Indian 
Rice Pudding)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64656"}, {"umap_0": -1.4741584062576294, "umap_1": 10.426227569580078, "region": "Spanish and Portuguese", "recipe": "Colorful Citrus Spanish Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140437"}, {"umap_0": 11.548933029174805, "umap_1": 8.305424690246582, "region": "Italian", "recipe": "Tiramisu Cups", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129284"}, {"umap_0": -0.8882127404212952, "umap_1": 8.422369003295898, "region": "Italian", "recipe": "Salmon and Dill Frittata", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124918"}, {"umap_0": 1.6205235719680786, "umap_1": 5.287153244018555, "region": "French", "recipe": "A1 Sauce Sweet and Savory Cherry Walnut Polenta Rounds #A1", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/111854"}, {"umap_0": 10.840353012084961, "umap_1": 9.385618209838867, "region": "Australian", "recipe": "Tropical Trifle", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73826"}, {"umap_0": -0.9255848526954651, "umap_1": 4.81312370300293, "region": "Italian", "recipe": "Italian Cutlets", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124096"}, {"umap_0": 0.039873454719781876, "umap_1": 9.16160774230957, "region": "Italian", "recipe": "Chicken and Capers in Tomato Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127899"}, {"umap_0": -1.2212255001068115, "umap_1": 4.640003681182861, "region": "Italian", "recipe": "Pesto Manicotti (No Tomatoes)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125988"}, {"umap_0": 0.9044475555419922, "umap_1": 4.555158615112305, "region": "Canadian", "recipe": "Pecan Blue-Cheese Spread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142632"}, {"umap_0": 9.282978057861328, "umap_1": 6.8065972328186035, "region": "Eastern European", "recipe": "Nut and Poppy Seed Rolls 
(May, 1972)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100528"}, {"umap_0": 8.339744567871094, "umap_1": 6.843602657318115, "region": "Irish", "recipe": "Real Homemade Root Beer", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134684"}, {"umap_0": 9.054905891418457, "umap_1": 6.683651924133301, "region": "Scandinavian", "recipe": "Swedish Kringle", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105982"}, {"umap_0": 1.8756550550460815, "umap_1": 7.322248935699463, "region": "Mexican", "recipe": "Crock Pot Chicken Fajitas--Easy!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/88962"}, {"umap_0": -1.218416690826416, "umap_1": 5.467983245849609, "region": "Italian", "recipe": "Mushroom &amp; Veggie Sausage Lasagna", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123825"}, {"umap_0": 2.0463197231292725, "umap_1": 9.473433494567871, "region": "Scandinavian", "recipe": "Flygande Jacob (Flying Jacob)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/106455"}, {"umap_0": 11.035293579101562, "umap_1": 7.8755974769592285, "region": "Australian", "recipe": "Chocolate Raspberry Waffle Pudding", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74977"}, {"umap_0": 0.14760257303714752, "umap_1": 12.35356616973877, "region": "Southeast Asian", "recipe": "Indonesian Serundeng (Crisp Spiced Coconut with peanuts)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59870"}, {"umap_0": 2.4332778453826904, "umap_1": 10.291032791137695, "region": "Southeast Asian", "recipe": "Gado Gado (Veggies With Peanut Sauce)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59790"}, {"umap_0": -1.868709921836853, "umap_1": 5.509363651275635, "region": "Italian", "recipe": "Antipasto Platter", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132681"}, {"umap_0": 2.060631513595581, "umap_1": 9.21524429321289, "region": "Caribbean", 
"recipe": "Wayne's Famous Beef Ribs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91130"}, {"umap_0": 11.019545555114746, "umap_1": 10.436295509338379, "region": "Irish", "recipe": "Lucky Leprechaun", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136413"}, {"umap_0": 2.9133384227752686, "umap_1": 10.572333335876465, "region": "Korean", "recipe": "Korean Barbecue Tofu", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69249"}, {"umap_0": 1.4277617931365967, "umap_1": 9.13318920135498, "region": "South American", "recipe": "Patti's Beef Cabbage Rolls", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96187"}, {"umap_0": -0.9506620764732361, "umap_1": 5.715970039367676, "region": "South American", "recipe": "Chili Beef Lasagne", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96069"}, {"umap_0": -0.8096973896026611, "umap_1": 7.222751617431641, "region": "Italian", "recipe": "Fettuccine Primavera", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123210"}, {"umap_0": -1.7326180934906006, "umap_1": 9.430590629577637, "region": "Greek", "recipe": "Nica Chick Orgasmic Chickpeas Bruschetta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/108652"}, {"umap_0": -0.21698667109012604, "umap_1": 9.42712116241455, "region": "South American", "recipe": "Beef Stroganoff Hamburger Dinner", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/97778"}, {"umap_0": 0.5565001368522644, "umap_1": 6.491284370422363, "region": "French", "recipe": "Dora's Rich French Onion Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/116059"}, {"umap_0": 3.490478515625, "umap_1": 9.585389137268066, "region": "Chinese and Mongolian", "recipe": "Zha Xiao Wan Zi (Chinese Fried Meatballs)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55749"}, {"umap_0": 1.04998779296875, "umap_1": 5.903123378753662, "region": "Mexican", "recipe": "Deep-Fried 
Chile Rellenos Wraps", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84052"}, {"umap_0": -2.2207531929016113, "umap_1": 7.246249675750732, "region": "Greek", "recipe": "Linda's Greek Pasta With Shrimp", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/108567"}, {"umap_0": 0.7537429332733154, "umap_1": 9.549909591674805, "region": "Caribbean", "recipe": "Trinidad Stewed Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/8096"}, {"umap_0": -0.46677491068840027, "umap_1": 8.112070083618164, "region": "Italian", "recipe": "Wild Risotto", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12662"}, {"umap_0": -0.19594578444957733, "umap_1": 7.825660228729248, "region": "South American", "recipe": "Tenderloin of Beef With Blue Cheese and Herb Crust", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/97863"}, {"umap_0": 10.367473602294922, "umap_1": 8.366002082824707, "region": "Canadian", "recipe": "White Chocolate and Orange Poppy Seed Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146798"}, {"umap_0": 11.226455688476562, "umap_1": 6.152523994445801, "region": "Eastern European", "recipe": "John Hadamuscin's Bohemian Butter Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142515"}, {"umap_0": -0.3826609253883362, "umap_1": 9.954937934875488, "region": "Middle Eastern", "recipe": "Lebanese Koosa (Stuffed Yellow Squash)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62903"}, {"umap_0": -0.9288316965103149, "umap_1": 8.375051498413086, "region": "Australian", "recipe": "Hopel-Popel", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72577"}, {"umap_0": -1.1559484004974365, "umap_1": 6.778564453125, "region": "Italian", "recipe": "Chicken Parmigiana", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131220"}, {"umap_0": 3.0296647548675537, "umap_1": 9.150810241699219, "region": "South American", 
"recipe": "Stir-Fried Beef With Mushrooms and Asparagus", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98659"}, {"umap_0": 10.417654991149902, "umap_1": 7.299461364746094, "region": "Irish", "recipe": "Irish Whiskey Brownies With Sticky Toffee &amp; Thick Cream", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136415"}, {"umap_0": -1.0038859844207764, "umap_1": 4.265255451202393, "region": "Italian", "recipe": "Kim's Manicotti", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/126852"}, {"umap_0": -0.4125710725784302, "umap_1": 9.570051193237305, "region": "Rest Africa", "recipe": "Tunisian-Style Catfish Nuggets", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/2775"}, {"umap_0": -1.6035642623901367, "umap_1": 8.764134407043457, "region": "Italian", "recipe": "Beans, Greens, and Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123493"}, {"umap_0": 10.453985214233398, "umap_1": 7.5885844230651855, "region": "Chinese and Mongolian", "recipe": "Arborio Rice Pudding", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/54356"}, {"umap_0": -2.3142261505126953, "umap_1": 7.850288391113281, "region": "Greek", "recipe": "Greek-Style Shrimp Salad on a Bed of Baby Spinach", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/9571"}, {"umap_0": 9.676576614379883, "umap_1": 7.882186412811279, "region": "Rest Africa", "recipe": "Tarartir-At-Turkman (Bonnets of the Turks)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47738"}, {"umap_0": 2.258389949798584, "umap_1": 9.30316162109375, "region": "Southeast Asian", "recipe": "Indonesian Fried Rice (Nasi Goreng)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3448"}, {"umap_0": 10.483269691467285, "umap_1": 10.343303680419922, "region": "Mexican", "recipe": "Trudy's Mexican Martini", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80511"}, {"umap_0": 1.6706291437149048, 
"umap_1": 9.118866920471191, "region": "Canadian", "recipe": "Best Meatloaf Ever !!!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143077"}, {"umap_0": 1.0075185298919678, "umap_1": 8.240187644958496, "region": "South American", "recipe": "Bucks County Beef Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96794"}, {"umap_0": 1.5099126100540161, "umap_1": 8.985830307006836, "region": "Canadian", "recipe": "Mc Donald's Secret Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147357"}, {"umap_0": 11.254157066345215, "umap_1": 9.01680850982666, "region": "UK", "recipe": "Traditional Scottish Cranachan (Dessert for Adults)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101675"}, {"umap_0": 3.140324354171753, "umap_1": 9.999537467956543, "region": "Chinese and Mongolian", "recipe": "P F Chang's Tuna Tataki", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55576"}, {"umap_0": 0.8179575800895691, "umap_1": 6.50093412399292, "region": "Italian", "recipe": "Tvp Spaghetti Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124413"}, {"umap_0": 8.218727111816406, "umap_1": 7.059240818023682, "region": "Eastern European", "recipe": "Hungarian White Bread With Fennel (Bread Machine)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105702"}, {"umap_0": -1.2702908515930176, "umap_1": 4.4477763175964355, "region": "Deutschland", "recipe": "German Tomato Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/13403"}, {"umap_0": 1.2562682628631592, "umap_1": 12.764361381530762, "region": "Indian Subcontinent", "recipe": "South Indian Lentil Kootu", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4470"}, {"umap_0": 10.969606399536133, "umap_1": 8.996729850769043, "region": "Mexican", "recipe": "Guilt Free Strawberry Liquada", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79569"}, {"umap_0": 
8.28555679321289, "umap_1": 7.844581604003906, "region": "Japanese", "recipe": "Simply Mochi", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4937"}, {"umap_0": 10.519747734069824, "umap_1": 7.363301753997803, "region": "Japanese", "recipe": "Japanese Green Tea Petits Fours", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4923"}, {"umap_0": 11.137533187866211, "umap_1": 10.379852294921875, "region": "Scandinavian", "recipe": "Roxy Special", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/139835"}, {"umap_0": -0.5091302394866943, "umap_1": 5.693753719329834, "region": "Canadian", "recipe": "BBQ'd Garlic Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144110"}, {"umap_0": 10.266124725341797, "umap_1": 8.742798805236816, "region": "Caribbean", "recipe": "Caribbean Pavlovas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90718"}, {"umap_0": 1.4891453981399536, "umap_1": 10.569977760314941, "region": "Middle Eastern", "recipe": "Turkish Chicken Thighs ($400. 
Winner)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69823"}, {"umap_0": 0.4208536744117737, "umap_1": 12.991476058959961, "region": "Indian Subcontinent", "recipe": "Stir-Fried Cauliflower &amp; Potato W- Crunchy Bengali Spices", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/68652"}, {"umap_0": 1.6420304775238037, "umap_1": 5.930147647857666, "region": "Mexican", "recipe": "Lamb and Bean Nachos With Salsa Fresca", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80014"}, {"umap_0": 0.41553762555122375, "umap_1": 6.457788467407227, "region": "Belgian", "recipe": "Dutch Potatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134469"}, {"umap_0": -1.7385210990905762, "umap_1": 8.296869277954102, "region": "Italian", "recipe": "Capers &amp; Sun-Dried Tomato Pasta Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129297"}, {"umap_0": -0.10568946599960327, "umap_1": 5.9412431716918945, "region": "Eastern European", "recipe": "Caviar Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100407"}, {"umap_0": 1.1807453632354736, "umap_1": 5.82730770111084, "region": "Mexican", "recipe": "Beth's Taco Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83637"}, {"umap_0": 10.269340515136719, "umap_1": 8.13279914855957, "region": "Scandinavian", "recipe": "Jerusalem Pudding", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/9189"}, {"umap_0": -0.036388691514730453, "umap_1": 10.5352144241333, "region": "Mexican", "recipe": "Don Fernando's Green Chile Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84184"}, {"umap_0": 10.067889213562012, "umap_1": 6.525073051452637, "region": "UK", "recipe": "Tiessennau Mel (honey Cakes) Welsh", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105079"}, {"umap_0": -1.534541368484497, "umap_1": 6.266122341156006, "region": "Italian", "recipe": "Bolognese Sauce - 
Sunday Gravy With Meat", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130856"}, {"umap_0": 0.29097118973731995, "umap_1": 10.859216690063477, "region": "South American", "recipe": "Southwestern Beef Brisket", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100000"}, {"umap_0": 7.8124470710754395, "umap_1": 6.949952125549316, "region": "Italian", "recipe": "Spelt Noodles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12283"}, {"umap_0": -1.1623494625091553, "umap_1": 6.1636433601379395, "region": "US", "recipe": "Tim's Sausage Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/18073"}, {"umap_0": 1.3285937309265137, "umap_1": 6.5275797843933105, "region": "Mexican", "recipe": "Mexi Cheeseburger Quiche", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79822"}, {"umap_0": -1.0547816753387451, "umap_1": 9.46082592010498, "region": "Italian", "recipe": "Amazing Spaghetti with Seafood Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122226"}, {"umap_0": 1.477939486503601, "umap_1": 7.1679840087890625, "region": "Spanish and Portuguese", "recipe": "Tortilla Espanola (Spanish Omelet)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141172"}, {"umap_0": 0.16818618774414062, "umap_1": 9.326894760131836, "region": "South American", "recipe": "Ground Beef-Venison Goulash", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/97264"}, {"umap_0": 1.1907352209091187, "umap_1": 4.826080322265625, "region": "Mexican", "recipe": "Creamy Smothered Chicken Burritos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/88837"}, {"umap_0": -1.8339776992797852, "umap_1": 9.466512680053711, "region": "Caribbean", "recipe": "Avocado Salad (Ensalada De Aguacate)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92340"}, {"umap_0": 0.17207179963588715, "umap_1": 9.50382137298584, "region": "Canadian", "recipe": "Herbed 
Chicken with Brown Rice Pilaf", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145186"}, {"umap_0": 0.44100239872932434, "umap_1": 5.775967597961426, "region": "US", "recipe": "Sinless Mississippi Sin", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/13847"}, {"umap_0": -0.28121545910835266, "umap_1": 7.490437030792236, "region": "US", "recipe": "Lobster Colorado", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/18215"}, {"umap_0": 11.686797142028809, "umap_1": 8.539962768554688, "region": "Mexican", "recipe": "Creamy Mexican Chocolate Iced Coffee", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78842"}, {"umap_0": -1.5916171073913574, "umap_1": 8.543956756591797, "region": "Italian", "recipe": "Italian \u201cDrunken\u201d Noodles With Spicy Italian Sausage", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131564"}, {"umap_0": 10.202253341674805, "umap_1": 7.9820942878723145, "region": "US", "recipe": "Scalloped Sweet Potatoes and Apples", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/14727"}, {"umap_0": 0.40166589617729187, "umap_1": 5.240620136260986, "region": "Italian", "recipe": "Mini Ham 'n' Cheese Frittatas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125634"}, {"umap_0": 1.8688045740127563, "umap_1": 6.32774543762207, "region": "Mexican", "recipe": "Chicken Tortilla Soup II", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/88979"}, {"umap_0": 1.5201773643493652, "umap_1": 7.182694911956787, "region": "Spanish and Portuguese", "recipe": "Spanish Tortilla With Vidalia Onions &amp; Ham", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140542"}, {"umap_0": 10.239057540893555, "umap_1": 8.936830520629883, "region": "Canadian", "recipe": "Healthy Fruit Candies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/149047"}, {"umap_0": -1.606303334236145, "umap_1": 6.47824764251709, "region": "UK", 
"recipe": "Easy English Pizza", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104419"}, {"umap_0": 1.9969687461853027, "umap_1": 5.155827045440674, "region": "South American", "recipe": "Too Easy Vegetable Beef Hot Dish", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/95703"}, {"umap_0": 9.890812873840332, "umap_1": 6.3435139656066895, "region": "Middle Eastern", "recipe": "Honey Oatmeal Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/60766"}, {"umap_0": -1.5631150007247925, "umap_1": 8.137102127075195, "region": "French", "recipe": "Chicken With Shallots, Prunes, and Armagnac", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114868"}, {"umap_0": 0.37364691495895386, "umap_1": 10.663841247558594, "region": "Mexican", "recipe": "Lentil Tacos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6013"}, {"umap_0": -0.2388976812362671, "umap_1": 7.822738170623779, "region": "French", "recipe": "Hachis Parmentier or French Style Shepherd's Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114059"}, {"umap_0": -0.7614589333534241, "umap_1": 7.115261554718018, "region": "Italian", "recipe": "Oven-Baked Red Pepper Risotto", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124035"}, {"umap_0": 3.422422170639038, "umap_1": 9.77412223815918, "region": "South American", "recipe": "Beef With Bean Sprouts", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99469"}, {"umap_0": -1.4373353719711304, "umap_1": 9.909008979797363, "region": "Caribbean", "recipe": "Black Beans", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92366"}, {"umap_0": 1.6213314533233643, "umap_1": 6.147777080535889, "region": "Mexican", "recipe": "Mexican Lasagna With Black Beans and Corn", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/86023"}, {"umap_0": -1.0119351148605347, "umap_1": 6.040040969848633, "region": "Italian", "recipe": 
"Simple Spinach Lasagna", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12368"}, {"umap_0": 10.456080436706543, "umap_1": 6.678522109985352, "region": "Scandinavian", "recipe": "Swedish Cardamom Coffee Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105886"}, {"umap_0": 11.077064514160156, "umap_1": 6.847013473510742, "region": "Canadian", "recipe": "Chocolate Truffles With Liqueur", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145808"}, {"umap_0": -0.28316232562065125, "umap_1": 10.125900268554688, "region": "Middle Eastern", "recipe": "Lamb, Carrot, and White Bean Curry Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3579"}, {"umap_0": -1.6278572082519531, "umap_1": 9.381786346435547, "region": "Deutschland", "recipe": "Wurstsalat - Sausage Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137610"}, {"umap_0": 10.206049919128418, "umap_1": 6.160872459411621, "region": "Canadian", "recipe": "Warm Apricot Cobbler", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142906"}, {"umap_0": 3.456279754638672, "umap_1": 10.273784637451172, "region": "Chinese and Mongolian", "recipe": "Chinese Hot Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/52512"}, {"umap_0": -0.324316143989563, "umap_1": 11.984728813171387, "region": "Caribbean", "recipe": "Weight Watcher Jerk Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91248"}, {"umap_0": 1.3149811029434204, "umap_1": 7.221072673797607, "region": "Italian", "recipe": "Methodist Spaghetti Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124533"}, {"umap_0": 0.33363932371139526, "umap_1": 11.082000732421875, "region": "Rest Africa", "recipe": "African Sweet Potato and Peanut Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49109"}, {"umap_0": 1.6657456159591675, "umap_1": 7.3172221183776855, "region": "South American", 
"recipe": "Desperation Beef", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98497"}, {"umap_0": -1.363763451576233, "umap_1": 7.4157915115356445, "region": "Italian", "recipe": "Grpa's Italian Spaghetti Sauce With Meatballs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131199"}, {"umap_0": 9.219595909118652, "umap_1": 8.402389526367188, "region": "Indian Subcontinent", "recipe": "Gajar Ka Halwa (Carrot Pudding)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72147"}, {"umap_0": 7.7559814453125, "umap_1": 6.929618835449219, "region": "Italian", "recipe": "Mediterranean Black Olive Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/11255"}, {"umap_0": -0.9790279865264893, "umap_1": 7.3705596923828125, "region": "Australian", "recipe": "Avocado BLT Sandwiches", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77594"}, {"umap_0": 1.893933892250061, "umap_1": 5.844855308532715, "region": "Mexican", "recipe": "Nacho Taco", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82522"}, {"umap_0": 1.3473308086395264, "umap_1": 8.750595092773438, "region": "Canadian", "recipe": "Rice, Corn and Cheese Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148630"}, {"umap_0": -0.34672340750694275, "umap_1": 9.231557846069336, "region": "US", "recipe": "Gluten-Free Chicken and Sausage Gumbo", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/14901"}, {"umap_0": 11.391111373901367, "umap_1": 8.514914512634277, "region": "Australian", "recipe": "Dolly Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74631"}, {"umap_0": -1.7576788663864136, "umap_1": 9.005260467529297, "region": "US", "recipe": "Healthy and Delicious Southern Turnip Greens", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16137"}, {"umap_0": -0.03633185103535652, "umap_1": 9.259231567382812, "region": "Mexican", "recipe": "Spicy Chorizo 
Stuffed Poblano Peppers With Cheese", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/88000"}, {"umap_0": 3.2821478843688965, "umap_1": 8.846996307373047, "region": "Japanese", "recipe": "Lobster Broth Noodle Bowl", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70774"}, {"umap_0": -2.3274147510528564, "umap_1": 9.041618347167969, "region": "Italian", "recipe": "Southern Italian Traditional Tomato Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/128508"}, {"umap_0": 9.073450088500977, "umap_1": 7.631127834320068, "region": "UK", "recipe": "Gluten Free Hot Cross Buns", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102760"}, {"umap_0": 1.7751922607421875, "umap_1": 4.635721206665039, "region": "Mexican", "recipe": "Nibbler's Delight", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83665"}, {"umap_0": 3.742229461669922, "umap_1": 10.06087875366211, "region": "Chinese and Mongolian", "recipe": "Sesame Tuna Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/52522"}, {"umap_0": -2.403917074203491, "umap_1": 11.73879623413086, "region": "Italian", "recipe": "Italian Seasoning II", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/11024"}, {"umap_0": -1.2843670845031738, "umap_1": 9.208592414855957, "region": "Northern Africa", "recipe": "Twice-Cooked Marrakech Red Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/50535"}, {"umap_0": 0.23657526075839996, "umap_1": 11.859291076660156, "region": "Indian Subcontinent", "recipe": "Indian Spiced Pea Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/68985"}, {"umap_0": 0.2634831368923187, "umap_1": 4.905213832855225, "region": "South American", "recipe": "Tasty Beef Spread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/97026"}, {"umap_0": -1.8585253953933716, "umap_1": 9.019905090332031, "region": "Australian", "recipe": "Green Olive Relish", 
"url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77489"}, {"umap_0": 1.4882398843765259, "umap_1": 8.0390625, "region": "Chinese and Mongolian", "recipe": "Mushroom Oven Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/54772"}, {"umap_0": 2.7530677318573, "umap_1": 9.849610328674316, "region": "Chinese and Mongolian", "recipe": "Chinese (Ramen) Noodle Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/51777"}, {"umap_0": -2.179657220840454, "umap_1": 9.200481414794922, "region": "Italian", "recipe": "Tuscan-Style Cauliflower", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120216"}, {"umap_0": 0.08531598746776581, "umap_1": 4.652676582336426, "region": "Deutschland", "recipe": "Swiss Cheese Pizzas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/117852"}, {"umap_0": 0.7458925843238831, "umap_1": 10.896985054016113, "region": "Middle Eastern", "recipe": "Lamb Stuffed Eggplant (Aubergine)  or Zucchini", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61402"}, {"umap_0": -1.0564689636230469, "umap_1": 5.529055595397949, "region": "Italian", "recipe": "Bruschetta Chicken Bake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131953"}, {"umap_0": 8.587217330932617, "umap_1": 7.322329998016357, "region": "Indian Subcontinent", "recipe": "MAL PUAS", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66662"}, {"umap_0": -1.2007917165756226, "umap_1": 8.501324653625488, "region": "Canadian", "recipe": "Baked Acorn Squash With Wild Rice Medley", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143718"}, {"umap_0": -1.7477892637252808, "umap_1": 9.9800386428833, "region": "Eastern European", "recipe": "Griby Marinovannye (Russian Marinated Mushrooms)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100697"}, {"umap_0": 10.324197769165039, "umap_1": 7.699040412902832, "region": "UK", "recipe": "Empire 
Biscuits", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/8648"}, {"umap_0": -1.35700261592865, "umap_1": 11.375781059265137, "region": "Mexican", "recipe": "Mango Tomatillo Salsa", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83886"}, {"umap_0": 1.6215407848358154, "umap_1": 10.150081634521484, "region": "Chinese and Mongolian", "recipe": "Chinese Barbecue Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56298"}, {"umap_0": 3.040983200073242, "umap_1": 10.141079902648926, "region": "Thai", "recipe": "Sesame Thai Noodles With Shrimp", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58916"}, {"umap_0": 8.212296485900879, "umap_1": 6.017678260803223, "region": "Indian Subcontinent", "recipe": "Tender Popovers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66869"}, {"umap_0": 1.4308379888534546, "umap_1": 5.747464656829834, "region": "Mexican", "recipe": "Salsa Chicken Burrito Filling", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/88724"}, {"umap_0": 10.7412691116333, "umap_1": 10.385404586791992, "region": "Caribbean", "recipe": "Almond Cigar", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91835"}, {"umap_0": 1.7323603630065918, "umap_1": 11.296564102172852, "region": "Southeast Asian", "recipe": "Marinade: Grilled Chicken With Lemongrass &amp; Chilli", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62064"}, {"umap_0": -1.2564170360565186, "umap_1": 9.748128890991211, "region": "Caribbean", "recipe": "Garlicky Cuban Shredded Pork", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92448"}, {"umap_0": -2.280280828475952, "umap_1": 9.997919082641602, "region": "Italian", "recipe": "Super Quick Pasta Al Pomodoro", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130576"}, {"umap_0": 2.1221442222595215, "umap_1": 5.257297515869141, "region": "South American", "recipe": "Make Ahead Chimichangas", 
"url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/95070"}, {"umap_0": -1.2632070779800415, "umap_1": 5.581155300140381, "region": "Italian", "recipe": "Easy, Healthy Vegetarian Lasagna", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130047"}, {"umap_0": -1.852502703666687, "umap_1": 7.257450103759766, "region": "Italian", "recipe": "Grilled Vegetable Muffuletta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127919"}, {"umap_0": -0.024608049541711807, "umap_1": 6.620772838592529, "region": "Belgian", "recipe": "Drunken French Country Chicken Au Gratin", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/106671"}, {"umap_0": -1.2357099056243896, "umap_1": 6.988382816314697, "region": "Italian", "recipe": "Bolognese the Old-Fashioned Way!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122642"}, {"umap_0": 1.0163264274597168, "umap_1": 10.669305801391602, "region": "Caribbean", "recipe": "Red Bean Chili", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90940"}, {"umap_0": 10.18138313293457, "umap_1": 8.957853317260742, "region": "UK", "recipe": "White Fruit Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/8934"}, {"umap_0": 11.172503471374512, "umap_1": 7.698637962341309, "region": "French", "recipe": "Cinnamon Bun French Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114388"}, {"umap_0": -1.2426837682724, "umap_1": 11.372151374816895, "region": "Rest Africa", "recipe": "Eggplant Compote", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47583"}, {"umap_0": -0.24096521735191345, "umap_1": 5.453311443328857, "region": "South American", "recipe": "Beef and Parmesan Sliders", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96326"}, {"umap_0": 9.258859634399414, "umap_1": 6.180406093597412, "region": "Canadian", "recipe": "Awesome Raisin Bread", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145392"}, {"umap_0": 10.568404197692871, "umap_1": 10.613948822021484, "region": "Mexican", "recipe": "Chipotle Grapefruit Margarita", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83910"}, {"umap_0": 9.520296096801758, "umap_1": 7.875370025634766, "region": "Australian", "recipe": "Dumplings and Cocky's Joy (Caramel Sauce)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/5274"}, {"umap_0": 10.333106994628906, "umap_1": 9.611656188964844, "region": "Middle Eastern", "recipe": "Persian Fruit  Salad  Desser Miveh", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/60334"}, {"umap_0": 0.24702638387680054, "umap_1": 7.763823986053467, "region": "French", "recipe": "Uncle Bill's French Mushroom Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113484"}, {"umap_0": 10.794291496276855, "umap_1": 7.143560886383057, "region": "Scandinavian", "recipe": "Eggedosis", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/107089"}, {"umap_0": 8.341460227966309, "umap_1": 6.825126647949219, "region": "Greek", "recipe": "Quick Yeast  Dough", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110031"}, {"umap_0": 3.2276620864868164, "umap_1": 10.498610496520996, "region": "Japanese", "recipe": "Teriyaki Marinade", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71076"}, {"umap_0": 0.8737309575080872, "umap_1": 7.615300178527832, "region": "UK", "recipe": "Shepherd's Pie Nutrition Loaded Vegan, Low Carb &amp; Low Fat", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/103198"}, {"umap_0": 0.16963794827461243, "umap_1": 5.111111640930176, "region": "Mexican", "recipe": "Mexican Snack", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87050"}, {"umap_0": -0.7524120211601257, "umap_1": 6.655940532684326, "region": "Italian", "recipe": "Hamburger Minestrone Soup (South Beach Diet)", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125717"}, {"umap_0": -1.1507580280303955, "umap_1": 11.432164192199707, "region": "Mexican", "recipe": "Fury Salsa", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79201"}, {"umap_0": 1.4013336896896362, "umap_1": 6.757198810577393, "region": "Australian", "recipe": "Creamy Chicken Curry Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73077"}, {"umap_0": -0.451552152633667, "umap_1": 10.86103630065918, "region": "Mexican", "recipe": "Poc Chuc", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83477"}, {"umap_0": 10.563416481018066, "umap_1": 7.145893096923828, "region": "Belgian", "recipe": "Dutch Almond Squares", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134019"}, {"umap_0": 1.0033942461013794, "umap_1": 9.71091365814209, "region": "Irish", "recipe": "Tortured Chicken - Beer Can", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135358"}, {"umap_0": 0.6945362091064453, "umap_1": 10.473905563354492, "region": "Mexican", "recipe": "Black Bean &amp; Corn Enchiladas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83447"}, {"umap_0": 10.621631622314453, "umap_1": 6.639152526855469, "region": "French", "recipe": "Fresh Peach French Toast Cobbler", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112599"}, {"umap_0": -0.10664887726306915, "umap_1": 10.672699928283691, "region": "Mexican", "recipe": "Home-Style Refried Beans", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78296"}, {"umap_0": -0.044455137103796005, "umap_1": 7.254788875579834, "region": "Australian", "recipe": "Australian Fish and Spring Onion Pies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76231"}, {"umap_0": 3.8564932346343994, "umap_1": 9.987544059753418, "region": "Japanese", "recipe": "Hiyashi Chuka (Japanese Summer Salad With Egg Noodles, Ham and G", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71546"}, {"umap_0": 9.169215202331543, "umap_1": 6.046225547790527, "region": "Middle Eastern", "recipe": "Neopolitan Flan (Aka Caramel Flan Cake)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61914"}, {"umap_0": -2.0501248836517334, "umap_1": 8.419882774353027, "region": "French", "recipe": "Soup a L'aille", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/117138"}, {"umap_0": 0.7450438141822815, "umap_1": 10.521781921386719, "region": "Mexican", "recipe": "Y. J. H. Taco Meat", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82972"}, {"umap_0": 1.9771983623504639, "umap_1": 10.00633430480957, "region": "Australian", "recipe": "Spicy Cucumber Relish", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76771"}, {"umap_0": 9.545906066894531, "umap_1": 6.6639180183410645, "region": "Spanish and Portuguese", "recipe": "Portuguese Spice Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118943"}, {"umap_0": 8.845976829528809, "umap_1": 6.484364032745361, "region": "Eastern European", "recipe": "Elsie Hronek's Kolaches", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142377"}, {"umap_0": 1.1501518487930298, "umap_1": 6.484608173370361, "region": "UK", "recipe": "Just an Egg", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102077"}, {"umap_0": 1.8158323764801025, "umap_1": 7.961818695068359, "region": "Chinese and Mongolian", "recipe": "Chinese Goulash", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56847"}, {"umap_0": -1.1699687242507935, "umap_1": 7.2184343338012695, "region": "Italian", "recipe": "Creamy Lemon-Pesto Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131276"}, {"umap_0": -1.8316586017608643, "umap_1": 7.719112396240234, "region": "Italian", "recipe": "Red Onion Marmalade and Ricotta Pizza", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/128108"}, {"umap_0": 10.48900032043457, "umap_1": 9.961418151855469, "region": "Caribbean", "recipe": "Island Paradise Smoothie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90522"}, {"umap_0": -1.7519497871398926, "umap_1": 6.356070041656494, "region": "Italian", "recipe": "Vegetable Manicotti", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123000"}, {"umap_0": 2.333085536956787, "umap_1": 8.439508438110352, "region": "Indian Subcontinent", "recipe": "Frizzled Onions", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4517"}, {"umap_0": 1.4769601821899414, "umap_1": 9.664591789245605, "region": "Rest Africa", "recipe": "Maf\u00e9", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48540"}, {"umap_0": -2.054671287536621, "umap_1": 8.32490062713623, "region": "Canadian", "recipe": "Brussels Sprout Salad With Walnuts", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146692"}, {"umap_0": -1.785033941268921, "umap_1": 8.110569953918457, "region": "Italian", "recipe": "Oven Bake Zucchini Bruschetta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120338"}, {"umap_0": 10.124299049377441, "umap_1": 8.89498519897461, "region": "Irish", "recipe": "Darjeeling Cranberry, Ginger and Orange Tea Loaf  (Fat-Free)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134624"}, {"umap_0": -0.7731830477714539, "umap_1": 10.115036010742188, "region": "Spanish and Portuguese", "recipe": "Bibb Salad With Basil Green Goddess Dressing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/119196"}, {"umap_0": 0.6388297080993652, "umap_1": 5.266788482666016, "region": "Greek", "recipe": "Deer Burgers (Greek Style)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110016"}, {"umap_0": 1.766107201576233, "umap_1": 9.230728149414062, "region": "Canadian", "recipe": "Wickaninnish Bread", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148560"}, {"umap_0": 9.445097923278809, "umap_1": 6.197263240814209, "region": "Canadian", "recipe": "Ginger Molasses Muffins", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143117"}, {"umap_0": 1.5031625032424927, "umap_1": 4.811975955963135, "region": "Mexican", "recipe": "Easy Layered Mexican Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/89860"}, {"umap_0": 9.703594207763672, "umap_1": 7.011175155639648, "region": "Scandinavian", "recipe": "Danish \"Brun Kager\" Brown Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101037"}, {"umap_0": 2.56091570854187, "umap_1": 10.489656448364258, "region": "Southeast Asian", "recipe": "Faux Chicken and Shrimp Pho", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62648"}, {"umap_0": 0.0743061900138855, "umap_1": 10.944851875305176, "region": "Caribbean", "recipe": "Callaloo (Creamy Spinach and Okra)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91760"}, {"umap_0": -2.2106070518493652, "umap_1": 8.656437873840332, "region": "Italian", "recipe": "Orzo and Zucchini Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/10397"}, {"umap_0": -1.2049610614776611, "umap_1": 5.853065490722656, "region": "Italian", "recipe": "Pasta With Marinated Artichoke Hearts", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120108"}, {"umap_0": 1.3049952983856201, "umap_1": 4.655038356781006, "region": "Mexican", "recipe": "Tex Mex Layered Taco Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87774"}, {"umap_0": -0.9617817401885986, "umap_1": 4.543949127197266, "region": "Italian", "recipe": "Easy Manicotti Florentine", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12624"}, {"umap_0": 10.526522636413574, "umap_1": 6.5723466873168945, "region": "Canadian", "recipe": "Semi-Sweet Chocolate Chip Cookies", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148178"}, {"umap_0": -1.3855253458023071, "umap_1": 6.555145740509033, "region": "Irish", "recipe": "Radish, Green Onion and Herbed Goat Cheese Stuff", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136048"}, {"umap_0": 1.6904698610305786, "umap_1": 6.429791450500488, "region": "Mexican", "recipe": "Fish Taquitos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6135"}, {"umap_0": -2.776108980178833, "umap_1": 8.504410743713379, "region": "French", "recipe": "Homemade Croutons", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115131"}, {"umap_0": 11.137245178222656, "umap_1": 7.688751220703125, "region": "French", "recipe": "Macadamia Nut French Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/116138"}, {"umap_0": -1.2505682706832886, "umap_1": 11.424125671386719, "region": "Thai", "recipe": "Nam Prik Num", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58463"}, {"umap_0": -1.730344533920288, "umap_1": 6.97441291809082, "region": "Italian", "recipe": "Shrimp parmesan", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124961"}, {"umap_0": -1.742838978767395, "umap_1": 9.430853843688965, "region": "Australian", "recipe": "Fettuccine Pasta With Mushroom &amp; Cashew Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77565"}, {"umap_0": -1.2151837348937988, "umap_1": 11.578920364379883, "region": "Mexican", "recipe": "Kellibou Guacamole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87204"}, {"umap_0": 2.889756441116333, "umap_1": 9.782360076904297, "region": "Chinese and Mongolian", "recipe": "Hunan-Style Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/2939"}, {"umap_0": 11.238444328308105, "umap_1": 7.771546840667725, "region": "Irish", "recipe": "Irish Cream Eggnog", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135783"}, 
{"umap_0": 0.23594745993614197, "umap_1": 13.804994583129883, "region": "Indian Subcontinent", "recipe": "Panch Phora (An Indian Spice)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72165"}, {"umap_0": 0.5237486362457275, "umap_1": 7.872950077056885, "region": "Deutschland", "recipe": "Bierocks (German Meat Turnovers)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138383"}, {"umap_0": 10.590596199035645, "umap_1": 6.287284851074219, "region": "Mexican", "recipe": "Brownies With Kahlua-coffee Glaze", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/86336"}, {"umap_0": 1.289198875427246, "umap_1": 4.7316670417785645, "region": "Mexican", "recipe": "A Healthier Taco Bake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83689"}, {"umap_0": 0.4529319107532501, "umap_1": 6.968672752380371, "region": "Deutschland", "recipe": "German Applesauce Meatloaf", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138652"}, {"umap_0": 0.8903799057006836, "umap_1": 10.439105987548828, "region": "Indian Subcontinent", "recipe": "Sarson Bhara Kekda (Shrimp With Mustard)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64016"}, {"umap_0": -0.4289466440677643, "umap_1": 7.974280834197998, "region": "French", "recipe": "Williams-Sonoma Chicken Breasts With White Wine Dijon Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113282"}, {"umap_0": -1.6823605298995972, "umap_1": 6.716250896453857, "region": "Middle Eastern", "recipe": "Onion and Chive Pitas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/60920"}, {"umap_0": -1.993635654449463, "umap_1": 7.573637008666992, "region": "Italian", "recipe": "Italian Sub Pasta Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120561"}, {"umap_0": -0.11163577437400818, "umap_1": 10.110419273376465, "region": "South American", "recipe": "Chilean Chicken", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93792"}, {"umap_0": 11.22740364074707, "umap_1": 7.716399192810059, "region": "Italian", "recipe": "Italian Sweet Ricotta Pie (Reduced Sugar)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120929"}, {"umap_0": -0.5217500925064087, "umap_1": 10.38526439666748, "region": "Spanish and Portuguese", "recipe": "Cauliflower Fritters", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140743"}, {"umap_0": 10.67398738861084, "umap_1": 10.782236099243164, "region": "Mexican", "recipe": "Chapala", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85225"}, {"umap_0": 9.957857131958008, "umap_1": 7.7751851081848145, "region": "UK", "recipe": "Lemon Loaf Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104135"}, {"umap_0": 3.4244654178619385, "umap_1": 9.992252349853516, "region": "Chinese and Mongolian", "recipe": "Oriental Chicken With Mushrooms", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/57233"}, {"umap_0": 10.09994125366211, "umap_1": 6.611826419830322, "region": "Irish", "recipe": "Honeyed Apple Pratie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135727"}, {"umap_0": 1.4124523401260376, "umap_1": 8.691213607788086, "region": "South American", "recipe": "Fried Beef Meatballs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99245"}, {"umap_0": 9.007814407348633, "umap_1": 7.791365146636963, "region": "Middle Eastern", "recipe": "Oatmeal Delight", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62794"}, {"umap_0": 1.764857292175293, "umap_1": 6.5402607917785645, "region": "Mexican", "recipe": "Pork and Cornbread Mexicana", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84336"}, {"umap_0": 9.328104972839355, "umap_1": 7.645816802978516, "region": "Chinese and Mongolian", "recipe": "Lite Rice Pudding", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/54568"}, {"umap_0": 1.2491569519042969, "umap_1": 4.610762596130371, "region": "Mexican", "recipe": "Super Turkey Enchiladas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/86096"}, {"umap_0": -2.171379804611206, "umap_1": 9.7780122756958, "region": "Spanish and Portuguese", "recipe": "Zucchini Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142099"}, {"umap_0": -0.18648378551006317, "umap_1": 9.990968704223633, "region": "Spanish and Portuguese", "recipe": "Portuguese Potato and Cilantro Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118852"}, {"umap_0": 0.4580736458301544, "umap_1": 13.198898315429688, "region": "Indian Subcontinent", "recipe": "Chicken Achari", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72169"}, {"umap_0": 10.690690994262695, "umap_1": 6.766312599182129, "region": "Canadian", "recipe": "Rugelach (Cinnamon Chocolate Twist Cookies)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145422"}, {"umap_0": 1.1408361196517944, "umap_1": 8.997994422912598, "region": "Canadian", "recipe": "Spicy Rice, Bean and Lentil Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148970"}, {"umap_0": 3.3390579223632812, "umap_1": 9.92939567565918, "region": "Japanese", "recipe": "Japanese Chicken Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71048"}, {"umap_0": 2.6195900440216064, "umap_1": 9.598106384277344, "region": "Chinese and Mongolian", "recipe": "Orange Teriyaki Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55770"}, {"umap_0": 10.930742263793945, "umap_1": 7.514244079589844, "region": "Australian", "recipe": "The River Cafe Chocolate Nemisis", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77352"}, {"umap_0": -0.4717872440814972, "umap_1": 5.430980205535889, "region": "Eastern European", "recipe": "Polish 
Potato Oven Omelet", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/133586"}, {"umap_0": -0.11827484518289566, "umap_1": 8.6296968460083, "region": "South American", "recipe": "California Beef Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/95468"}, {"umap_0": -0.28187525272369385, "umap_1": 10.13260555267334, "region": "Indian Subcontinent", "recipe": "Arabian Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64014"}, {"umap_0": 0.24582378566265106, "umap_1": 10.788599967956543, "region": "Mexican", "recipe": "Tacos al Pastor", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/7006"}, {"umap_0": -2.716320514678955, "umap_1": 7.726858615875244, "region": "Greek", "recipe": "Greek Stuffed Tomatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/9612"}, {"umap_0": -0.003000135999172926, "umap_1": 9.355081558227539, "region": "Chinese and Mongolian", "recipe": "Tijuana Kitchen Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53599"}, {"umap_0": -0.5996366143226624, "umap_1": 11.267865180969238, "region": "Deutschland", "recipe": "Chanterelle Goulash", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/107318"}, {"umap_0": 10.01669979095459, "umap_1": 6.219729900360107, "region": "Canadian", "recipe": "Applesauce Spice Muffins", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145300"}, {"umap_0": -0.22979630529880524, "umap_1": 8.075222969055176, "region": "Italian", "recipe": "Bagna Cauda", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132872"}, {"umap_0": -1.4859440326690674, "umap_1": 11.559985160827637, "region": "US", "recipe": "Southern Cajun Boiled Peanuts", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/18523"}, {"umap_0": 2.64054012298584, "umap_1": 10.404324531555176, "region": "Thai", "recipe": "Beef Stew With Star Anise", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58990"}, {"umap_0": 11.430074691772461, "umap_1": 7.354372024536133, "region": "Canadian", "recipe": "Chocolate Bread Pudding With Grand Marnier Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146222"}, {"umap_0": 0.5669595003128052, "umap_1": 4.472657203674316, "region": "Italian", "recipe": "Italian Crock Pot Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/133236"}, {"umap_0": -0.2746475636959076, "umap_1": 6.1436004638671875, "region": "Italian", "recipe": "Chicken Pecan Fettuccine", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122372"}, {"umap_0": 9.964795112609863, "umap_1": 6.9988884925842285, "region": "Central American", "recipe": "American Kitchen Classic Orange Scones", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93067"}, {"umap_0": 1.1916712522506714, "umap_1": 10.572083473205566, "region": "Rest Africa", "recipe": "Poulet Seychelles Avec Couscous De Cilantro", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/51445"}, {"umap_0": -0.0804871916770935, "umap_1": 8.373921394348145, "region": "Spanish and Portuguese", "recipe": "Chicken in Pepitoria", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141671"}, {"umap_0": -2.5202386379241943, "umap_1": 8.427446365356445, "region": "Italian", "recipe": "Scampi Appetizer Alla Buonavia", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132313"}, {"umap_0": 3.305241584777832, "umap_1": 7.8580217361450195, "region": "Indian Subcontinent", "recipe": "Best Healthy Vegan Indian Naan (Or Garlic Naan) Bread!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/68739"}, {"umap_0": 0.9556859135627747, "umap_1": 7.283071994781494, "region": "Canadian", "recipe": "Salmon Noodle Bake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144808"}, {"umap_0": -2.0338566303253174, "umap_1": 8.396404266357422, 
"region": "Canadian", "recipe": "Olive, Bell Pepper, &amp; Cherry Tomato Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146744"}, {"umap_0": 10.680204391479492, "umap_1": 7.35562801361084, "region": "Caribbean", "recipe": "Diplomatic Pudding", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92474"}, {"umap_0": 0.8323772549629211, "umap_1": 8.162130355834961, "region": "Italian", "recipe": "Italian Hamburger Veggie Soup Like Olive Garden Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127633"}, {"umap_0": -2.3619329929351807, "umap_1": 8.512286186218262, "region": "Greek", "recipe": "Greece is the word!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110296"}, {"umap_0": -1.4145227670669556, "umap_1": 9.945884704589844, "region": "Rest Africa", "recipe": "Small Fish Balls With Tomato Sauce (Couirat El Hout)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48460"}, {"umap_0": 1.5199891328811646, "umap_1": 10.49068546295166, "region": "Indian Subcontinent", "recipe": "Hot And Spicy Lamb Curry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/65554"}, {"umap_0": 1.0741924047470093, "umap_1": 12.337237358093262, "region": "US", "recipe": "Summer Sausage", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16524"}, {"umap_0": -1.1129255294799805, "umap_1": 10.725693702697754, "region": "Greek", "recipe": "Grecian Pork Tenderloin", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/9525"}, {"umap_0": 2.6766271591186523, "umap_1": 7.287966728210449, "region": "UK", "recipe": "Toad in the Hole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101253"}, {"umap_0": -1.1776509284973145, "umap_1": 9.082773208618164, "region": "Italian", "recipe": "Chef John's Baby Porchetta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12083"}, {"umap_0": 0.29402899742126465, "umap_1": 12.527606010437012, "region": "Rest 
Africa", "recipe": "Paprika Tomatoes With Poached Eggs (Shakshouka)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48697"}, {"umap_0": 9.571575164794922, "umap_1": 6.615438461303711, "region": "South American", "recipe": "Brazilian Honey-Spice Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93348"}, {"umap_0": -1.374618649482727, "umap_1": 7.029244422912598, "region": "French", "recipe": "Contessa's Heirloom Tomatoes With Blue Cheese Dressing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/111425"}, {"umap_0": -0.529896080493927, "umap_1": 6.634117126464844, "region": "Greek", "recipe": "Low-Fat Moussaka", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/108925"}, {"umap_0": 9.487479209899902, "umap_1": 6.0801568031311035, "region": "Caribbean", "recipe": "Tropical Coconut Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90613"}, {"umap_0": 9.085963249206543, "umap_1": 9.023821830749512, "region": "Indian Subcontinent", "recipe": "Kahwah - Indian Green Tea", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67536"}, {"umap_0": 1.8164329528808594, "umap_1": 5.24844217300415, "region": "Indian Subcontinent", "recipe": "Bombay Sandwich #SP5", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66056"}, {"umap_0": 0.4653521776199341, "umap_1": 11.802844047546387, "region": "Rest Africa", "recipe": "Spinach, Mushroom &amp; Lemon Rice Pilaf", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48740"}, {"umap_0": 3.2062275409698486, "umap_1": 9.970130920410156, "region": "Korean", "recipe": "Korean Grilled Beef Skewers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69244"}, {"umap_0": 10.043781280517578, "umap_1": 7.809237003326416, "region": "Australian", "recipe": "Upside-Down Caramel Banana Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74566"}, {"umap_0": -1.6933401823043823, "umap_1": 
9.863703727722168, "region": "Middle Eastern", "recipe": "Almond Tabbouli Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63245"}, {"umap_0": -0.016614174470305443, "umap_1": 9.0322847366333, "region": "Deutschland", "recipe": "Bratwurst", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/107404"}, {"umap_0": 9.67780876159668, "umap_1": 7.556888103485107, "region": "UK", "recipe": "Orkney Broonie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102075"}, {"umap_0": -1.360687255859375, "umap_1": 6.232619762420654, "region": "South American", "recipe": "Beef, Chilli and Red Wine Casserole With Polenta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/97759"}, {"umap_0": -0.10307172685861588, "umap_1": 11.276639938354492, "region": "Caribbean", "recipe": "BBQ Caribbean Beef Kabobs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91707"}, {"umap_0": 2.862422227859497, "umap_1": 10.620397567749023, "region": "Thai", "recipe": "Spicy Thai Shrimp Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3279"}, {"umap_0": 0.3825432360172272, "umap_1": 7.953254699707031, "region": "South American", "recipe": "Lobster Risotto With Herb-Rubbed Beef Tenderloin", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99093"}, {"umap_0": 1.4311424493789673, "umap_1": 8.408391952514648, "region": "Chinese and Mongolian", "recipe": "Mushroom Rice With Onion &amp; Shallots", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53560"}, {"umap_0": 1.120116949081421, "umap_1": 10.286821365356445, "region": "Mexican", "recipe": "Pinto Beans for Burritos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83063"}, {"umap_0": 0.5684047937393188, "umap_1": 11.6340970993042, "region": "Northern Africa", "recipe": "Moroccan Tomato Lentil Stew (Soup)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/50026"}, {"umap_0": -1.9199708700180054, 
"umap_1": 7.6344170570373535, "region": "Canadian", "recipe": "Pizza Roll", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144480"}, {"umap_0": -1.057639241218567, "umap_1": 8.456275939941406, "region": "Spanish and Portuguese", "recipe": "Fava Bean Soup With Portuguese Sausage", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118981"}, {"umap_0": 1.4196845293045044, "umap_1": 6.074916839599609, "region": "Mexican", "recipe": "Beef or Chicken Fajitas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/88189"}, {"umap_0": -1.5638091564178467, "umap_1": 8.615351676940918, "region": "Chinese and Mongolian", "recipe": "Brown Rice &amp; Goat Cheese Cakes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53443"}, {"umap_0": 0.2731707990169525, "umap_1": 7.18326997756958, "region": "Deutschland", "recipe": "Z\u00fcrigschn\u00e4tzlets (Veal in Cream Sauce- Veal Zurich)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118278"}, {"umap_0": 0.0924263596534729, "umap_1": 10.312410354614258, "region": "Eastern European", "recipe": "Blackened Chicken Fettuccine", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105296"}, {"umap_0": 0.2203356772661209, "umap_1": 12.600595474243164, "region": "Rest Africa", "recipe": "Cumin Seed Potatoes, Batata B\u2019kamun from Qutar", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47749"}, {"umap_0": 9.327796936035156, "umap_1": 10.850570678710938, "region": "Eastern European", "recipe": "Guilt-Free Russian Tea", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100150"}, {"umap_0": 0.48706337809562683, "umap_1": 7.97437858581543, "region": "Eastern European", "recipe": "Sauteed Cabbage With Pork", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100844"}, {"umap_0": -0.5474689602851868, "umap_1": 10.905699729919434, "region": "Rest Africa", "recipe": "Green Beans and Shrimp - Saudi Arabia", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47934"}, {"umap_0": -0.7713521122932434, "umap_1": 9.198267936706543, "region": "UK", "recipe": "Lamb for Learners! Redcurrant and Honey Glazed Lamb (Crock Pot)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/103150"}, {"umap_0": -1.8276220560073853, "umap_1": 8.605207443237305, "region": "Japanese", "recipe": "Polenta Casserole With Sauteed Bell Peppers and Sausage", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71730"}, {"umap_0": 1.187954306602478, "umap_1": 8.14775562286377, "region": "Irish", "recipe": "Crock Pot Irish Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135960"}, {"umap_0": 2.542471408843994, "umap_1": 7.068464279174805, "region": "Italian", "recipe": "Chicago Style Deep Dish Sausage Pizza - Real Deep Dish", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131952"}, {"umap_0": -0.49463245272636414, "umap_1": 6.42936897277832, "region": "Canadian", "recipe": "New Twist for Mushrooms on Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143114"}, {"umap_0": -2.257624387741089, "umap_1": 8.200695991516113, "region": "Italian", "recipe": "Aglio e Olio alla Fremont", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12242"}, {"umap_0": 10.195355415344238, "umap_1": 10.024937629699707, "region": "Indian Subcontinent", "recipe": "Lime Ice Cubes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/68374"}, {"umap_0": 2.4918410778045654, "umap_1": 5.456921100616455, "region": "South American", "recipe": "Awesome Beef or Chicken Taco Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/95643"}, {"umap_0": 2.157564640045166, "umap_1": 8.465614318847656, "region": "Australian", "recipe": "Pumpkin Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/5578"}, {"umap_0": 10.410820960998535, "umap_1": 8.73094654083252, "region": "Deutschland", "recipe": 
"Brischtner Nytlae (Dried Pears in Spice Wine)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118496"}, {"umap_0": -0.24130992591381073, "umap_1": 8.735085487365723, "region": "French", "recipe": "Roti De Porc Au Lait - Pork Loin With Whole Milk", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115620"}, {"umap_0": 0.6458545327186584, "umap_1": 12.06512451171875, "region": "Middle Eastern", "recipe": "Baked Sweet Potato Falafel", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3561"}, {"umap_0": 8.338778495788574, "umap_1": 6.122470378875732, "region": "Italian", "recipe": "Friselle (Little Pepper Nuggets)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123758"}, {"umap_0": -0.04856689274311066, "umap_1": 9.370747566223145, "region": "Italian", "recipe": "Flavorful Beef, Peppers, and Onions With White Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127834"}, {"umap_0": -1.4204189777374268, "umap_1": 6.356353759765625, "region": "Italian", "recipe": "Tomato Feta Frittata", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123326"}, {"umap_0": -0.19540724158287048, "umap_1": 10.629534721374512, "region": "Mexican", "recipe": "Citrus Chicken and Pepper Fajitas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/89444"}, {"umap_0": 2.6694087982177734, "umap_1": 10.78990650177002, "region": "Thai", "recipe": "Big Bowl's Chicken Pad Thai (For Two)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/57985"}, {"umap_0": -0.7180554270744324, "umap_1": 9.044707298278809, "region": "Australian", "recipe": "Golden Carrot Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74572"}, {"umap_0": 1.76638662815094, "umap_1": 7.531803607940674, "region": "Deutschland", "recipe": "Simple Wiener Schnitzel", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137350"}, {"umap_0": 9.939409255981445, "umap_1": 
6.940561771392822, "region": "Deutschland", "recipe": "German Cobbler Cake #RSC", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138201"}, {"umap_0": -2.582714557647705, "umap_1": 7.7991204261779785, "region": "Greek", "recipe": "Greek Salad With Pita Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/107675"}, {"umap_0": 0.10040132701396942, "umap_1": 5.899693489074707, "region": "French", "recipe": "Tartiflette", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115703"}, {"umap_0": -0.3396962285041809, "umap_1": 8.218035697937012, "region": "French", "recipe": "Hunters  Chicken Chasseur", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114914"}, {"umap_0": 8.236907958984375, "umap_1": 6.314720153808594, "region": "US", "recipe": "Potato Bread from Idahoan\u00ae", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16555"}, {"umap_0": 11.791729927062988, "umap_1": 9.007614135742188, "region": "Irish", "recipe": "Creme de Menthe Cheesecake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134674"}, {"umap_0": 2.1584532260894775, "umap_1": 9.890329360961914, "region": "Chinese and Mongolian", "recipe": "Wilted Greens &amp; Tofu  Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/52934"}, {"umap_0": 1.5597060918807983, "umap_1": 4.754740238189697, "region": "Mexican", "recipe": "Mexican Fiesta Salad Pizza", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81915"}, {"umap_0": 1.7058980464935303, "umap_1": 7.181153297424316, "region": "French", "recipe": "Crock Pot French Dip Roast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115042"}, {"umap_0": 10.38370132446289, "umap_1": 8.518284797668457, "region": "Canadian", "recipe": "Liscombe Lodge Maple Syrup Cream", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147568"}, {"umap_0": 11.451323509216309, "umap_1": 7.367675304412842, "region": "French", 
"recipe": "Baked Cinnamon Sugar French Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115680"}, {"umap_0": 7.747557163238525, "umap_1": 6.774069786071777, "region": "Italian", "recipe": "Home Made Noodles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129859"}, {"umap_0": 1.3871736526489258, "umap_1": 8.233147621154785, "region": "Irish", "recipe": "Another Pork Chops and Beer Recipe", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134926"}, {"umap_0": 1.8537144660949707, "umap_1": 10.456645965576172, "region": "Thai", "recipe": "Thai Green Bean Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59560"}, {"umap_0": 0.7565900683403015, "umap_1": 8.215730667114258, "region": "Eastern European", "recipe": "Hungarian Goulash (Gulas')", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105340"}, {"umap_0": 1.2058415412902832, "umap_1": 4.785986423492432, "region": "Middle Eastern", "recipe": "Oman Potato Chip Sandwich", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61203"}, {"umap_0": 10.282078742980957, "umap_1": 9.412310600280762, "region": "South American", "recipe": "Mazamorra Morada", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93997"}, {"umap_0": -1.2957980632781982, "umap_1": 11.212735176086426, "region": "Caribbean", "recipe": "Cuban Mango Mojo", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92369"}, {"umap_0": -1.651509404182434, "umap_1": 12.028188705444336, "region": "Mexican", "recipe": "Guacamole. 
Avocado Cucumber Dip!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78873"}, {"umap_0": 1.6591243743896484, "umap_1": 9.01375675201416, "region": "Canadian", "recipe": "Kindersely Road House Ribs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144213"}, {"umap_0": 8.850723266601562, "umap_1": 7.7457499504089355, "region": "Deutschland", "recipe": "Honig Brat Mandeln (Honey Roasted Almonds)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137419"}, {"umap_0": 3.3087291717529297, "umap_1": 9.057167053222656, "region": "Chinese and Mongolian", "recipe": "Pork and Squid Chinese Dumpling Filling", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53023"}, {"umap_0": 1.1993076801300049, "umap_1": 4.328829288482666, "region": "Mexican", "recipe": "Microwave Mexican Pizza", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87951"}, {"umap_0": 0.4797804057598114, "umap_1": 7.510035514831543, "region": "Spanish and Portuguese", "recipe": "Spanish Chicken Livers (Higados De Pollo a La Espaniola)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140628"}, {"umap_0": -0.9859377145767212, "umap_1": 6.057549953460693, "region": "US", "recipe": "Steak 'n' Fries Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/13970"}, {"umap_0": -1.9234247207641602, "umap_1": 8.83180046081543, "region": "Italian", "recipe": "Caponata Authentic and Delicious", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132971"}, {"umap_0": -0.5151568651199341, "umap_1": 8.250602722167969, "region": "South American", "recipe": "Roast Prime Ribs of Beef  With Cracked Pepper Crust", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99643"}, {"umap_0": -2.306030035018921, "umap_1": 10.037104606628418, "region": "Canadian", "recipe": "Braised Fennel and Onion Pasta Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146503"}, {"umap_0": 
10.015828132629395, "umap_1": 5.275490760803223, "region": "Scandinavian", "recipe": "Mom's Buttermilk Pancakes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/13414"}, {"umap_0": 0.207315593957901, "umap_1": 11.756802558898926, "region": "Eastern European", "recipe": "Polish Goulash With Kielbasa", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/133506"}, {"umap_0": -2.4116129875183105, "umap_1": 8.637063026428223, "region": "Italian", "recipe": "Cherry Tomato, Bocconcini and Basil Bruschetta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130554"}, {"umap_0": 9.35564136505127, "umap_1": 7.052667617797852, "region": "Greek", "recipe": "Tsoureki (Greek Easter Sweet Bread)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/107912"}, {"umap_0": -1.232099175453186, "umap_1": 7.777127265930176, "region": "Middle Eastern", "recipe": "Chicken Saute With Mozzarella Cheese (Kasarli Tavuk)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69813"}, {"umap_0": 0.16747550666332245, "umap_1": 11.451703071594238, "region": "Northern Africa", "recipe": "Moroccan-Style Lamb", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/50449"}, {"umap_0": 9.638171195983887, "umap_1": 6.198990821838379, "region": "Canadian", "recipe": "Carrot Pineapple Muffins", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144489"}, {"umap_0": 0.021898040547966957, "umap_1": 10.133394241333008, "region": "Mexican", "recipe": "Good for You Taco Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83005"}, {"umap_0": 10.530296325683594, "umap_1": 7.541575908660889, "region": "Italian", "recipe": "Lemon Basil Creme Brulee", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/121028"}, {"umap_0": -1.6048246622085571, "umap_1": 8.164427757263184, "region": "Italian", "recipe": "Meg's Red Sauce (Loaded With Tomatoes and Mushrooms)", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129552"}, {"umap_0": -0.12072480469942093, "umap_1": 5.19794225692749, "region": "Australian", "recipe": "Mini Egg &amp; Bacon Pies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76449"}, {"umap_0": 0.8512495756149292, "umap_1": 9.65469741821289, "region": "Mexican", "recipe": "Mexican Sheppards Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79062"}, {"umap_0": -1.2206172943115234, "umap_1": 6.755017280578613, "region": "Italian", "recipe": "Beans, Greens, and Sausage Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131547"}, {"umap_0": 10.606935501098633, "umap_1": 9.433394432067871, "region": "Canadian", "recipe": "Berry Blast Smoothie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/149177"}, {"umap_0": 4.049715995788574, "umap_1": 9.372949600219727, "region": "Japanese", "recipe": "Macrobiotic Oatmeal (Gluten-Free)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70555"}, {"umap_0": 1.9353752136230469, "umap_1": 10.66810131072998, "region": "Canadian", "recipe": "Apple Maple Beer Glazed Salmon", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146057"}, {"umap_0": 0.5410172939300537, "umap_1": 7.58200216293335, "region": "Italian", "recipe": "Buca Di Beppo Chicken With Lemon", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123277"}, {"umap_0": -1.5258145332336426, "umap_1": 5.550809860229492, "region": "Italian", "recipe": "Italian Chicken (or Shrimp) w-Tomato Cream Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/126663"}, {"umap_0": 0.28247714042663574, "umap_1": 12.702008247375488, "region": "Rest Africa", "recipe": "Pickled Okra", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48406"}, {"umap_0": -1.296913504600525, "umap_1": 9.81494426727295, "region": "Italian", "recipe": "Spaghetti &amp; Lasagna Sauce", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/119340"}, {"umap_0": 1.5782115459442139, "umap_1": 7.235260963439941, "region": "French", "recipe": "Quickie French Onion Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/9892"}, {"umap_0": 0.6184023022651672, "umap_1": 6.855791091918945, "region": "UK", "recipe": "Yorkshire Pudding With Herbs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105029"}, {"umap_0": 9.358656883239746, "umap_1": 5.7030415534973145, "region": "Canadian", "recipe": "Aunt Helen's Buttermilk Pancakes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146876"}, {"umap_0": 1.7690497636795044, "umap_1": 5.804295539855957, "region": "Mexican", "recipe": "Bean &amp; Charred Corn Nachos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81238"}, {"umap_0": -1.5968114137649536, "umap_1": 6.058291912078857, "region": "Italian", "recipe": "Italian Burgers With Salami &amp; Mozzarella", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/121908"}, {"umap_0": 1.1434025764465332, "umap_1": 7.27559232711792, "region": "Caribbean", "recipe": "Beef Jerk Burgers ( Caribbean)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90533"}, {"umap_0": -1.2217457294464111, "umap_1": 11.69527530670166, "region": "Mexican", "recipe": "Roasted Tomatillo and Garlic Salsa", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/7598"}, {"umap_0": 3.019717216491699, "umap_1": 9.367509841918945, "region": "Chinese and Mongolian", "recipe": "Chinese Long Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/52869"}, {"umap_0": 1.3353976011276245, "umap_1": 4.9188079833984375, "region": "Mexican", "recipe": "Old New Mexican Breakfast Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84308"}, {"umap_0": 9.346112251281738, "umap_1": 8.884907722473145, "region": "Australian", "recipe": "Grumichama or Brazil Cherry Ice 
Blocks", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73381"}, {"umap_0": 0.7967424988746643, "umap_1": 12.843717575073242, "region": "Indian Subcontinent", "recipe": "Imli Aur Dhaniye Ki Hari Chutney (Tamarind &amp; Cilantro Dip)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72208"}, {"umap_0": -1.6918190717697144, "umap_1": 8.513998031616211, "region": "Greek", "recipe": "Flamb\u00e9ed Shrimp With Tomatoes, Feta Cheese, and Ouzo", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/108153"}, {"umap_0": 9.802016258239746, "umap_1": 9.393402099609375, "region": "Thai", "recipe": "Coco - Limeade", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/57607"}, {"umap_0": -1.1158806085586548, "umap_1": 9.03155517578125, "region": "Greek", "recipe": "Greek Burgers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/9563"}, {"umap_0": -0.18470890820026398, "umap_1": 6.584038257598877, "region": "French", "recipe": "Fresh Cream of Tomato Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114368"}, {"umap_0": 2.9376025199890137, "umap_1": 10.095795631408691, "region": "Thai", "recipe": "Thai Veal Scaloppine", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58080"}, {"umap_0": 0.14134319126605988, "umap_1": 10.777466773986816, "region": "Mexican", "recipe": "Chipotle Bean Burgers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80074"}, {"umap_0": -2.1879806518554688, "umap_1": 9.941044807434082, "region": "Australian", "recipe": "Maple Vinaigrette", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74171"}, {"umap_0": -0.09920617192983627, "umap_1": 8.322317123413086, "region": "Deutschland", "recipe": "German Potato Soup for the Crock Pot", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/139010"}, {"umap_0": 2.039147138595581, "umap_1": 6.8982768058776855, "region": "South American", "recipe": "Roast Beef with 
Mushroom &amp; Onions", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/97203"}, {"umap_0": 1.362606406211853, "umap_1": 9.81152057647705, "region": "South American", "recipe": "Beef Curry for Slow Cooker", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99419"}, {"umap_0": -0.49709126353263855, "umap_1": 9.022865295410156, "region": "French", "recipe": "French Lentil Soup With Sausage", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114554"}, {"umap_0": 10.13572883605957, "umap_1": 6.623961448669434, "region": "Canadian", "recipe": "My Kids Favourite Chocolate Chip Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143329"}, {"umap_0": 2.0781054496765137, "umap_1": 9.4347505569458, "region": "Australian", "recipe": "Fruity Lamb Curry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76607"}, {"umap_0": 9.623250007629395, "umap_1": 7.62224817276001, "region": "Eastern European", "recipe": "Russian Caramels", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100226"}, {"umap_0": 0.9148565530776978, "umap_1": 12.756828308105469, "region": "Indian Subcontinent", "recipe": "Hot and Sour Tomato Broth With Shrimp", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/65507"}, {"umap_0": 8.530837059020996, "umap_1": 6.678146839141846, "region": "Indian Subcontinent", "recipe": "Naan", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4566"}, {"umap_0": 1.6683013439178467, "umap_1": 7.894680976867676, "region": "Scandinavian", "recipe": "Porcupine Meatballs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/106142"}, {"umap_0": 9.491013526916504, "umap_1": 6.482408046722412, "region": "Italian", "recipe": "Italian Biscotti", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120667"}, {"umap_0": 2.738337516784668, "umap_1": 9.366826057434082, "region": "Chinese and Mongolian", "recipe": "Almond Chicken Appetizers", 
"url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56157"}, {"umap_0": -2.153474807739258, "umap_1": 8.227680206298828, "region": "Greek", "recipe": "Paula Deen's Greek Dressing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/108202"}, {"umap_0": 2.2133500576019287, "umap_1": 5.362441062927246, "region": "Mexican", "recipe": "Bob's Taco Soup With Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87288"}, {"umap_0": 0.38845351338386536, "umap_1": 7.730276584625244, "region": "Irish", "recipe": "Irish Hotpot", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134797"}, {"umap_0": -1.2442249059677124, "umap_1": 4.64749002456665, "region": "Italian", "recipe": "Tri-Color Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122311"}, {"umap_0": 1.9040604829788208, "umap_1": 7.023955345153809, "region": "Italian", "recipe": "Irresistible Italian Corn", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/10327"}, {"umap_0": -0.9400758147239685, "umap_1": 5.981141567230225, "region": "Italian", "recipe": "Speedy Gonzalez's Kielbasa Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130375"}, {"umap_0": 0.44226670265197754, "umap_1": 5.752058506011963, "region": "Australian", "recipe": "Cheesy Chicken Dippers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76820"}, {"umap_0": 1.557762622833252, "umap_1": 5.61331844329834, "region": "Mexican", "recipe": "White Chicken Enchilada Slow-Cooker Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6857"}, {"umap_0": 1.1891599893569946, "umap_1": 5.942542552947998, "region": "Mexican", "recipe": "Saturday Night Chicken, Cheese and Refried Bean Enchiladas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87438"}, {"umap_0": 1.1769767999649048, "umap_1": 10.489023208618164, "region": "Indian Subcontinent", "recipe": "Indian Sweet Potato Salad", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66905"}, {"umap_0": 1.5091673135757446, "umap_1": 10.260404586791992, "region": "UK", "recipe": "Curried Apple and Leek Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/9010"}, {"umap_0": 2.6078097820281982, "umap_1": 9.342131614685059, "region": "Canadian", "recipe": "Chicken Linguine Soup - Crock Pot", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142937"}, {"umap_0": 2.086662769317627, "umap_1": 5.249919414520264, "region": "Mexican", "recipe": "Frijoles With Salsa", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84074"}, {"umap_0": -1.858019232749939, "umap_1": 7.242621898651123, "region": "Italian", "recipe": "Zucchini and Shells", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12480"}, {"umap_0": -1.8012325763702393, "umap_1": 8.58883285522461, "region": "Middle Eastern", "recipe": "Couscous Salad (Vegan)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61866"}, {"umap_0": 0.4397864043712616, "umap_1": 7.415644645690918, "region": "Irish", "recipe": "Gaelic Steak Flambe", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134914"}, {"umap_0": -0.0687330961227417, "umap_1": 8.473615646362305, "region": "Spanish and Portuguese", "recipe": "Lomo Al Ajillo (Pork Loin in Garlic)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140655"}, {"umap_0": -1.8862475156784058, "umap_1": 8.448264122009277, "region": "French", "recipe": "Vegan Pan Bagnat", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/111699"}, {"umap_0": -0.5002841353416443, "umap_1": 10.394060134887695, "region": "Mexican", "recipe": "Mayan Couscous", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6246"}, {"umap_0": -1.0137816667556763, "umap_1": 7.4154839515686035, "region": "South American", "recipe": "Beef and Lamb Meatball Kabobs With Individual Dipping Sauces", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96686"}, {"umap_0": 0.48926401138305664, "umap_1": 10.984363555908203, "region": "Mexican", "recipe": "Tacos from Scratch (Way Better Than a Packet and Just As Easy!)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79500"}, {"umap_0": 9.578376770019531, "umap_1": 8.592094421386719, "region": "Middle Eastern", "recipe": "Kirazli Dondurma - Cherry Ice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69981"}, {"umap_0": 11.283649444580078, "umap_1": 8.37243938446045, "region": "Scandinavian", "recipe": "Swedish Cream with Summer Berries", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/9160"}, {"umap_0": 11.302166938781738, "umap_1": 8.413507461547852, "region": "Irish", "recipe": "Holiday Cookie Pops", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134709"}, {"umap_0": 0.5016542077064514, "umap_1": 6.434261322021484, "region": "Indian Subcontinent", "recipe": "Ekuri - Spicy Scrambled Eggs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67896"}, {"umap_0": 1.8788880109786987, "umap_1": 8.74628734588623, "region": "Mexican", "recipe": "Mexican Pork", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81399"}, {"umap_0": -0.9979049563407898, "umap_1": 7.649424076080322, "region": "Italian", "recipe": "Primavera With Seared Seafood", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127279"}, {"umap_0": 0.6275359392166138, "umap_1": 12.174811363220215, "region": "Southeast Asian", "recipe": "Malaysian Beef Curry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69792"}, {"umap_0": -1.2578353881835938, "umap_1": 8.31895923614502, "region": "UK", "recipe": "Honeyed Winter Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104865"}, {"umap_0": -1.9776275157928467, "umap_1": 7.977775573730469, "region": "Middle Eastern", "recipe": "Lebanese Fattoush", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3846"}, {"umap_0": 1.0413832664489746, "umap_1": 9.257974624633789, "region": "Indian Subcontinent", "recipe": "Kadai Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4079"}, {"umap_0": 2.988891124725342, "umap_1": 9.276805877685547, "region": "Chinese and Mongolian", "recipe": "Potstickers (Chinese Dumplings)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3089"}, {"umap_0": 0.3861425518989563, "umap_1": 10.429703712463379, "region": "Mexican", "recipe": "Chile Rojo", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85757"}, {"umap_0": 9.798489570617676, "umap_1": 5.372229099273682, "region": "UK", "recipe": "Yorkshire Sausages", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/103358"}, {"umap_0": -1.0138651132583618, "umap_1": 4.760813236236572, "region": "Italian", "recipe": "Protein Packed Italian Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/133053"}, {"umap_0": 1.4162359237670898, "umap_1": 9.315413475036621, "region": "Eastern European", "recipe": "Mom's Hungarian Goulash", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105386"}, {"umap_0": -1.321197509765625, "umap_1": 4.770620346069336, "region": "Italian", "recipe": "Mobster's Manicotti", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123172"}, {"umap_0": 1.7008883953094482, "umap_1": 5.992887020111084, "region": "Mexican", "recipe": "Pupusas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85632"}, {"umap_0": 2.7506191730499268, "umap_1": 8.600227355957031, "region": "Chinese and Mongolian", "recipe": "Chinese Fried Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53375"}, {"umap_0": 1.807955265045166, "umap_1": 7.862448692321777, "region": "Mexican", "recipe": "Mexicali Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79242"}, {"umap_0": 0.12790027260780334, 
"umap_1": 4.979955196380615, "region": "US", "recipe": "Asparagus Casserole I", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/14512"}, {"umap_0": 3.2846734523773193, "umap_1": 9.370888710021973, "region": "Chinese and Mongolian", "recipe": "Wonton Soup or Fried Wonton", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55510"}, {"umap_0": 10.378767967224121, "umap_1": 8.718915939331055, "region": "Canadian", "recipe": "Currant Jelly", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147718"}, {"umap_0": 10.641512870788574, "umap_1": 6.999699115753174, "region": "UK", "recipe": "Old-Fashioned Shortbread (4 Ingredients)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101711"}, {"umap_0": -0.40492239594459534, "umap_1": 9.244776725769043, "region": "Middle Eastern", "recipe": "Mujadarra and More", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61129"}, {"umap_0": 0.0029182357247918844, "umap_1": 13.750909805297852, "region": "Chinese and Mongolian", "recipe": "Chinese Take-Out: Chinese Five Spice Powder", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53214"}, {"umap_0": 7.668046951293945, "umap_1": 5.79879093170166, "region": "French", "recipe": "Food Processor French Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113358"}, {"umap_0": 10.3917236328125, "umap_1": 9.42120361328125, "region": "Rest Africa", "recipe": "African Dessert", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47655"}, {"umap_0": -0.43921467661857605, "umap_1": 12.259627342224121, "region": "Spanish and Portuguese", "recipe": "Spanish Rice Using Quinoa", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142031"}, {"umap_0": 9.913810729980469, "umap_1": 6.179925441741943, "region": "UK", "recipe": "Cacen Ddyrnu (Welsh Threshing Cake)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105088"}, {"umap_0": 0.05230466648936272, 
"umap_1": 8.382396697998047, "region": "Greek", "recipe": "Chicken Lemon Rice Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/109638"}, {"umap_0": 3.0586671829223633, "umap_1": 8.872909545898438, "region": "Japanese", "recipe": "Miso Mushrooms", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70882"}, {"umap_0": -1.5325583219528198, "umap_1": 9.906431198120117, "region": "South American", "recipe": "Marinated Herbed Beef in a Salt Crust", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96612"}, {"umap_0": 1.062806248664856, "umap_1": 7.800811767578125, "region": "South American", "recipe": "Heavenly Beef Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99777"}, {"umap_0": 2.652691602706909, "umap_1": 9.47354793548584, "region": "Indian Subcontinent", "recipe": "Chinese Drumsticks Pakistani Style", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67743"}, {"umap_0": 10.367888450622559, "umap_1": 5.658212661743164, "region": "Deutschland", "recipe": "Easy Apple Kuchen", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137555"}, {"umap_0": 2.6496775150299072, "umap_1": 11.04352855682373, "region": "Thai", "recipe": "Thai Barbecue", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/57618"}, {"umap_0": 0.7284457087516785, "umap_1": 9.66073226928711, "region": "Mexican", "recipe": "Rollitos de Pollo en Salsa de Guajillo (Chicken Rolls in Guajillo Pepper Sauce)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6889"}, {"umap_0": 11.651995658874512, "umap_1": 7.62949275970459, "region": "French", "recipe": "Simple French Toast Bake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/116455"}, {"umap_0": -0.7817762494087219, "umap_1": 6.0642781257629395, "region": "Spanish and Portuguese", "recipe": "The Spanish Ranch Panini ! 
#RSC", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140423"}, {"umap_0": 10.050749778747559, "umap_1": 7.220972061157227, "region": "UK", "recipe": "Raspberry Scottish Shortbread ( Vegan!)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101403"}, {"umap_0": -0.5603259801864624, "umap_1": 7.575573921203613, "region": "Italian", "recipe": "Fanny's Italian Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/11800"}, {"umap_0": -1.7217803001403809, "umap_1": 8.684728622436523, "region": "Irish", "recipe": "Garlic and Herb Roasted Potatoes and Squash", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136310"}, {"umap_0": 9.430368423461914, "umap_1": 6.7747907638549805, "region": "Canadian", "recipe": "Strawberry Sourdough Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146982"}, {"umap_0": 3.0441465377807617, "umap_1": 9.833536148071289, "region": "Chinese and Mongolian", "recipe": "Xiao Long Bao (Little Buns)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56291"}, {"umap_0": 1.0489275455474854, "umap_1": 10.118488311767578, "region": "US", "recipe": "New England Bean Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/14433"}, {"umap_0": 10.32991886138916, "umap_1": 5.905421257019043, "region": "Canadian", "recipe": "Rhubarb Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144792"}, {"umap_0": -0.535754919052124, "umap_1": 5.8870368003845215, "region": "Deutschland", "recipe": "Swiss Potato and Mushroom Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/117540"}, {"umap_0": 2.0968968868255615, "umap_1": 6.599760055541992, "region": "UK", "recipe": "English Beer Batter", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104541"}, {"umap_0": 10.33456802368164, "umap_1": 7.327070236206055, "region": "Spanish and Portuguese", "recipe": "Toucinho Do Ceu (Heavenly Fat)", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118906"}, {"umap_0": 1.831363558769226, "umap_1": 4.865825653076172, "region": "Irish", "recipe": "Stuffed Banana Peppers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135038"}, {"umap_0": 0.017885826528072357, "umap_1": 11.440414428710938, "region": "Middle Eastern", "recipe": "Fish in Persian Sweet-And-Sour Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61494"}, {"umap_0": 11.18381404876709, "umap_1": 7.685989856719971, "region": "French", "recipe": "Overnight Eggnog French Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113487"}, {"umap_0": 10.75781536102295, "umap_1": 10.99802017211914, "region": "Mexican", "recipe": "Frozen Mixed Berry Margarita", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/86800"}, {"umap_0": 1.2414777278900146, "umap_1": 8.073561668395996, "region": "US", "recipe": "Mauigirl's Loco Moco", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16732"}, {"umap_0": -2.1548430919647217, "umap_1": 8.517128944396973, "region": "Italian", "recipe": "Simple Caprese Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/10454"}, {"umap_0": 3.3709864616394043, "umap_1": 10.169971466064453, "region": "Korean", "recipe": "Korean Fried Chicken (Spicy Version)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69343"}, {"umap_0": 2.2434241771698, "umap_1": 10.606340408325195, "region": "Australian", "recipe": "Vegemite Asian Prawns With Green Slaw", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74959"}, {"umap_0": -1.783941626548767, "umap_1": 7.800165176391602, "region": "Middle Eastern", "recipe": "Fresh Herb and Tomato Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/60082"}, {"umap_0": -1.0681947469711304, "umap_1": 7.536027431488037, "region": "Italian", "recipe": "My Kickin' Pasta Fagioli Soup", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/128615"}, {"umap_0": -0.25660020112991333, "umap_1": 7.814643383026123, "region": "French", "recipe": "Deconstructed Lobster Bisque", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/116291"}, {"umap_0": -0.10545535385608673, "umap_1": 11.115190505981445, "region": "Caribbean", "recipe": "Trini Doubles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90344"}, {"umap_0": 9.48012638092041, "umap_1": 6.869088172912598, "region": "Spanish and Portuguese", "recipe": "Portuguese Sweet Bread for Bread Machine", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118892"}, {"umap_0": -1.2693212032318115, "umap_1": 5.742815971374512, "region": "Australian", "recipe": "Pizza Topping - Smoked Salmon Pizza", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72740"}, {"umap_0": 0.6392019391059875, "umap_1": 8.06462287902832, "region": "Deutschland", "recipe": "Pheasant with Sauerkraut", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138574"}, {"umap_0": -2.360288619995117, "umap_1": 8.828530311584473, "region": "Deutschland", "recipe": "My Mother's Chicken and Potatoes With Special Touches", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118452"}, {"umap_0": 0.08298793435096741, "umap_1": 12.085104942321777, "region": "Indian Subcontinent", "recipe": "Fish and  Cabbage", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72373"}, {"umap_0": -0.40015560388565063, "umap_1": 4.595405578613281, "region": "French", "recipe": "Pepperoni Souffle", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112428"}, {"umap_0": 1.0597927570343018, "umap_1": 9.302349090576172, "region": "Mexican", "recipe": "Leftover Steak Green Chili Burros Slow Cooker", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81532"}, {"umap_0": -1.2263686656951904, "umap_1": 11.830113410949707, "region": "Mexican", "recipe": "Pico 
de Gallo", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/7836"}, {"umap_0": -1.1378872394561768, "umap_1": 10.97275161743164, "region": "Australian", "recipe": "Onion Marmalade for New Zealand Meatloaf", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76975"}, {"umap_0": 10.368918418884277, "umap_1": 7.893019676208496, "region": "UK", "recipe": "Lemon--Cranberry Shortbread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101898"}, {"umap_0": 0.12785731256008148, "umap_1": 8.886496543884277, "region": "Greek", "recipe": "Beef Stifado", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/9449"}, {"umap_0": 2.873894214630127, "umap_1": 10.279242515563965, "region": "Japanese", "recipe": "Shrimp in Mild Tomato Chili Sauce (Ebi No Chiri So-Su)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70436"}, {"umap_0": 1.377915859222412, "umap_1": 10.808008193969727, "region": "Rest Africa", "recipe": "East African Curry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48816"}, {"umap_0": -0.621149480342865, "umap_1": 7.715200424194336, "region": "Chinese and Mongolian", "recipe": "Chinese Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/51968"}, {"umap_0": -1.9091218709945679, "umap_1": 6.969133377075195, "region": "Italian", "recipe": "Genoese Minestrone", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/128503"}, {"umap_0": -0.13178449869155884, "umap_1": 10.678367614746094, "region": "Mexican", "recipe": "Chihuahua-Style Salsa Verde", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/7975"}, {"umap_0": 8.130681037902832, "umap_1": 6.629275798797607, "region": "Mexican", "recipe": "Soft Corn Tortilla Pancakes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85070"}, {"umap_0": -1.1368415355682373, "umap_1": 5.316522121429443, "region": "Italian", "recipe": "Baked Butternut Squash Gratin - Giada De Laurentiis", 
"url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130588"}, {"umap_0": 0.16225770115852356, "umap_1": 9.77759838104248, "region": "Mexican", "recipe": "Chicken Enchiladas Verde", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/89214"}, {"umap_0": 0.8651372194290161, "umap_1": 5.820237159729004, "region": "Italian", "recipe": "Bacon, Sour Cream, Potato Manicotti With Gorgonzola Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/119679"}, {"umap_0": 8.680720329284668, "umap_1": 7.362706661224365, "region": "Australian", "recipe": "Gluten-Free All-Purpose Flour Mix #1", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75989"}, {"umap_0": -0.12426546961069107, "umap_1": 8.102157592773438, "region": "Belgian", "recipe": "Braised Belgian Endive", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/106612"}, {"umap_0": 10.735156059265137, "umap_1": 10.2648344039917, "region": "Caribbean", "recipe": "Bahama Lullaby", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90409"}, {"umap_0": -1.5713587999343872, "umap_1": 5.383267402648926, "region": "Canadian", "recipe": "California Cauliflower", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145806"}, {"umap_0": -1.838204026222229, "umap_1": 9.752130508422852, "region": "Deutschland", "recipe": "Swiss Baked Tomatoes in Cream", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118292"}, {"umap_0": -1.9509501457214355, "umap_1": 9.508674621582031, "region": "Italian", "recipe": "Summer Tomato Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/126027"}, {"umap_0": 11.799275398254395, "umap_1": 8.082860946655273, "region": "Deutschland", "recipe": "German Chocolate Coconut Bars", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/139106"}, {"umap_0": 2.805629253387451, "umap_1": 9.534446716308594, "region": "Southeast Asian", "recipe": "Pancit", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63765"}, {"umap_0": 0.13560540974140167, "umap_1": 8.411050796508789, "region": "Canadian", "recipe": "Sauteed Potatoes and Leeks", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144021"}, {"umap_0": 1.5124220848083496, "umap_1": 9.755524635314941, "region": "Indian Subcontinent", "recipe": "Slowburn Goat Curry With Friend Roti", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/68442"}, {"umap_0": -2.026599884033203, "umap_1": 9.886940956115723, "region": "South American", "recipe": "Venezuelan Beef Roast Marinade", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/94059"}, {"umap_0": 2.2772741317749023, "umap_1": 7.4765849113464355, "region": "Italian", "recipe": "Aromatic Bread Dumplings", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131601"}, {"umap_0": -0.10391520708799362, "umap_1": 4.592491626739502, "region": "Australian", "recipe": "Avocado, Ham and Cheese Melt", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76723"}, {"umap_0": 2.2713582515716553, "umap_1": 10.485796928405762, "region": "Southeast Asian", "recipe": "Indonesian Pork Satay", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3438"}, {"umap_0": 0.8210528492927551, "umap_1": 8.437171936035156, "region": "Italian", "recipe": "Ravioli Meat Filling", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131151"}, {"umap_0": -0.986822247505188, "umap_1": 8.382224082946777, "region": "Italian", "recipe": "Mushroom Lovers' Turkey Cacciatore", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122536"}, {"umap_0": -1.4210366010665894, "umap_1": 10.303997993469238, "region": "Australian", "recipe": "Couscous With Ginger, Orange, Almond &amp; Herbs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75558"}, {"umap_0": 1.4648970365524292, "umap_1": 5.342130184173584, "region": "Mexican", "recipe": "Southwest Baked 
Potatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/86325"}, {"umap_0": 0.2557952105998993, "umap_1": 9.102579116821289, "region": "Canadian", "recipe": "Apple Onion Sauce for Pork", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148338"}, {"umap_0": -2.27400803565979, "umap_1": 8.722634315490723, "region": "Italian", "recipe": "Braised Ligurian Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/126592"}, {"umap_0": 9.238080024719238, "umap_1": 6.174140930175781, "region": "South American", "recipe": "Banana Upside Cake (bolo De Banana)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93454"}, {"umap_0": 2.551232099533081, "umap_1": 9.633914947509766, "region": "US", "recipe": "Five Spice Turkey Cheeseburgers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16752"}, {"umap_0": 1.4470648765563965, "umap_1": 8.176922798156738, "region": "Deutschland", "recipe": "Simply the Best  Hot German Potato Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138865"}, {"umap_0": 1.2717951536178589, "umap_1": 12.364078521728516, "region": "Indian Subcontinent", "recipe": "Basmati Rice with Vegetables", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66486"}, {"umap_0": -0.16624312102794647, "umap_1": 10.866996765136719, "region": "Northern Africa", "recipe": "Moroccan Braised Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49660"}, {"umap_0": -0.4595973789691925, "umap_1": 5.337884902954102, "region": "Mexican", "recipe": "Poblano Queso Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79590"}, {"umap_0": 10.196186065673828, "umap_1": 6.824253559112549, "region": "Irish", "recipe": "Irish Apple Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/13101"}, {"umap_0": -0.4888237416744232, "umap_1": 9.748560905456543, "region": "Middle Eastern", "recipe": "Lentils With Wild Rice &amp; 
Crispy Onions (Koshari)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47338"}, {"umap_0": 0.4378003478050232, "umap_1": 12.573949813842773, "region": "Mexican", "recipe": "Spaghetti Squash With Mexican Spices", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78463"}, {"umap_0": -0.7579576969146729, "umap_1": 8.894984245300293, "region": "Italian", "recipe": "Sicilian Chicken With Lemon, Mint and Almonds", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122673"}, {"umap_0": 11.66150951385498, "umap_1": 7.368163585662842, "region": "Italian", "recipe": "Batista's Cappuccino", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/121239"}, {"umap_0": 1.0617879629135132, "umap_1": 4.742095947265625, "region": "US", "recipe": "Texas Stuffed Mushrooms", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17405"}, {"umap_0": -1.9260987043380737, "umap_1": 8.716657638549805, "region": "Greek", "recipe": "Taramousalata (Greek Caviar Spread)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/9366"}, {"umap_0": 0.7939980626106262, "umap_1": 10.385493278503418, "region": "Indian Subcontinent", "recipe": "Grilled Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72364"}, {"umap_0": -1.2129461765289307, "umap_1": 11.186966896057129, "region": "Mexican", "recipe": "Green Chili Salsa Dip (Beware Very Hot)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79195"}, {"umap_0": 0.44562482833862305, "umap_1": 11.186622619628906, "region": "Northern Africa", "recipe": "Couscous With Savory Caramelized Onion", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49546"}, {"umap_0": 1.7255510091781616, "umap_1": 6.847947597503662, "region": "Italian", "recipe": "Italian Beef Brisket", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123914"}, {"umap_0": -2.6131300926208496, "umap_1": 8.673362731933594, "region": "French", "recipe": 
"Figs With Fennel, Goat Cheese and Green Peppercorn Vinaigrette", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114276"}, {"umap_0": -1.6937246322631836, "umap_1": 9.167866706848145, "region": "Australian", "recipe": "Sarah's Fabulous Summer Pasta Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72916"}, {"umap_0": 1.787846326828003, "umap_1": 10.631842613220215, "region": "Chinese and Mongolian", "recipe": "Baked Curry Chicken With a Side of Coconut Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53905"}, {"umap_0": -0.8051272630691528, "umap_1": 10.017407417297363, "region": "Scandinavian", "recipe": "Danish Coleslaw - K\u00e5lsalat", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140142"}, {"umap_0": 2.686178207397461, "umap_1": 7.480287551879883, "region": "Greek", "recipe": "Spinach Feta Pizza II", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110105"}, {"umap_0": 1.936732530593872, "umap_1": 9.649888038635254, "region": "Canadian", "recipe": "Canadian Pork Roast With Gravy", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146884"}, {"umap_0": 1.1046338081359863, "umap_1": 5.846712112426758, "region": "Mexican", "recipe": "Angela's Awesome Enchiladas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6836"}, {"umap_0": 2.3830513954162598, "umap_1": 10.378469467163086, "region": "Chinese and Mongolian", "recipe": "Curried Shrimp With Brown Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/54903"}, {"umap_0": -2.4874606132507324, "umap_1": 9.287787437438965, "region": "Italian", "recipe": "Rosemary-Scented, Extra-Crispy Fried Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/121889"}, {"umap_0": 10.922629356384277, "umap_1": 8.385458946228027, "region": "Canadian", "recipe": "Apple Bread Pudding With Maple Rum Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148779"}, 
{"umap_0": 3.0882670879364014, "umap_1": 9.804500579833984, "region": "Chinese and Mongolian", "recipe": "Emerald Dragon Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55971"}, {"umap_0": -1.9299217462539673, "umap_1": 7.339765548706055, "region": "French", "recipe": "Paves De Saumon Au Jambon De Bayonne (Salmon Steaks With Ham)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/116087"}, {"umap_0": -1.3202654123306274, "umap_1": 5.64713716506958, "region": "Italian", "recipe": "Spinach Stuffed Chicken Breast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130749"}, {"umap_0": 10.50543212890625, "umap_1": 7.773617744445801, "region": "Australian", "recipe": "Sugar Crust Sponge", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77431"}, {"umap_0": 2.072361707687378, "umap_1": 7.6413187980651855, "region": "Eastern European", "recipe": "Chicken Kiev", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/8560"}, {"umap_0": -1.7715682983398438, "umap_1": 7.237468242645264, "region": "Italian", "recipe": "Tuscan Cream Cheese Spread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132853"}, {"umap_0": 1.8808650970458984, "umap_1": 8.346580505371094, "region": "Chinese and Mongolian", "recipe": "Chinese Fried Beef and Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53365"}, {"umap_0": 3.478912591934204, "umap_1": 9.913973808288574, "region": "Japanese", "recipe": "Gyoza ( Pot Stickers)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70727"}, {"umap_0": 10.820918083190918, "umap_1": 6.653618812561035, "region": "Canadian", "recipe": "Gingered Chocolate Whipped Cream", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144010"}, {"umap_0": 7.882758140563965, "umap_1": 5.812241554260254, "region": "Mexican", "recipe": "' Fnky' Flour Tortillas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87691"}, 
{"umap_0": 3.675859212875366, "umap_1": 9.842616081237793, "region": "Japanese", "recipe": "Plum Salad Dressing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71114"}, {"umap_0": -1.7821314334869385, "umap_1": 8.905890464782715, "region": "Italian", "recipe": "Venus Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122175"}, {"umap_0": -1.0263582468032837, "umap_1": 4.592350959777832, "region": "Italian", "recipe": "Cheese and Nut Stuffed Shells", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131057"}, {"umap_0": 0.7894139289855957, "umap_1": 7.219599723815918, "region": "South American", "recipe": "Marsala Creamed Beef With Noodles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99656"}, {"umap_0": 3.4995975494384766, "umap_1": 9.765250205993652, "region": "Chinese and Mongolian", "recipe": "Chinese Orange-Sesame Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55217"}, {"umap_0": 0.1961808055639267, "umap_1": 7.55633020401001, "region": "Middle Eastern", "recipe": "Terbiyeli Kereviz (Poached Celeriac)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69874"}, {"umap_0": 2.3259527683258057, "umap_1": 10.292511940002441, "region": "Southeast Asian", "recipe": "Vietnamese Warm Chicken Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62422"}, {"umap_0": -0.9323740005493164, "umap_1": 11.338334083557129, "region": "Mexican", "recipe": "Mexican Low Fat Citrus Salad Dressing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82408"}, {"umap_0": -0.8552679419517517, "umap_1": 4.24830961227417, "region": "Deutschland", "recipe": "Potato, Egg and Ham Bake (Bauern Fr\u00fchstuck)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137488"}, {"umap_0": 12.06760025024414, "umap_1": 7.778079032897949, "region": "Canadian", "recipe": "Chocolate-Peanut Crisp Bars", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148346"}, {"umap_0": 1.5701814889907837, "umap_1": 9.666003227233887, "region": "Rest Africa", "recipe": "Casbah Chicken with Orange Infused Basmati Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48872"}, {"umap_0": -0.60016930103302, "umap_1": 7.991176128387451, "region": "Middle Eastern", "recipe": "Grilled Chicken Skewers With Hummus-Yogurt Dipping Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61251"}, {"umap_0": 1.0263564586639404, "umap_1": 8.08423900604248, "region": "Middle Eastern", "recipe": "Middle Eastern Lamb and Bean Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61547"}, {"umap_0": 1.7628854513168335, "umap_1": 5.835239887237549, "region": "Mexican", "recipe": "Greg's Pinto Beans - Refried Beans", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77900"}, {"umap_0": 8.52304744720459, "umap_1": 6.319225788116455, "region": "French", "recipe": "Savarin With Berries", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113728"}, {"umap_0": -1.792480707168579, "umap_1": 7.952422618865967, "region": "Italian", "recipe": "Italian Marinade", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125218"}, {"umap_0": 0.8909968137741089, "umap_1": 11.691295623779297, "region": "Rest Africa", "recipe": "Doro Wat (Spicy Chicken Stew)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49156"}, {"umap_0": 2.3138632774353027, "umap_1": 10.02312183380127, "region": "Southeast Asian", "recipe": "Eggplant (Aubergine) Adobo", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63753"}, {"umap_0": -1.6291037797927856, "umap_1": 7.161388397216797, "region": "Italian", "recipe": "Pasta Roma", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127934"}, {"umap_0": 11.494688987731934, "umap_1": 6.804284572601318, "region": "US", "recipe": "Deb's Passover Brownies", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/14643"}, {"umap_0": -0.6093863844871521, "umap_1": 11.441083908081055, "region": "South American", "recipe": "Oven-Roasted Tenderloin (Beef, Pork, Even Venison)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/97098"}, {"umap_0": -1.3691362142562866, "umap_1": 9.727387428283691, "region": "Italian", "recipe": "Rosemary Egg Frittata", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12428"}, {"umap_0": 11.390667915344238, "umap_1": 6.852691650390625, "region": "UK", "recipe": "Peanut-Butter Flapjacks", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/103030"}, {"umap_0": 0.8971931338310242, "umap_1": 12.64441204071045, "region": "Indian Subcontinent", "recipe": "Indian  Butter Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/65213"}, {"umap_0": 0.18060825765132904, "umap_1": 10.762374877929688, "region": "Mexican", "recipe": "Instant Pot\u00ae Charro (Refried Beans)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6186"}, {"umap_0": 0.9632858037948608, "umap_1": 7.276350021362305, "region": "Scandinavian", "recipe": "Easy Swedish Meatballs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/139910"}, {"umap_0": 10.606472969055176, "umap_1": 10.26840877532959, "region": "Australian", "recipe": "Lime Sunset Non-Alcoholic Cocktail", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75744"}, {"umap_0": 10.591907501220703, "umap_1": 7.810356616973877, "region": "French", "recipe": "Lavender Nougat", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112872"}, {"umap_0": 9.035015106201172, "umap_1": 5.901548385620117, "region": "Canadian", "recipe": "Most Tender Pie Crust", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143104"}, {"umap_0": 0.38223251700401306, "umap_1": 5.967268943786621, "region": "Italian", "recipe": "Cjalzons - Friulian Pasta Filled With Savory Sweet 
Stuffing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124869"}, {"umap_0": 0.5889646410942078, "umap_1": 6.238109588623047, "region": "Canadian", "recipe": "Cheese Krispies Crackers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143953"}, {"umap_0": -1.3035809993743896, "umap_1": 8.116023063659668, "region": "Italian", "recipe": "Fast Eddie's Hustlin' Eggplant", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120391"}, {"umap_0": -1.6137914657592773, "umap_1": 12.053178787231445, "region": "Central American", "recipe": "Salsa from Costa Rica!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92751"}, {"umap_0": 3.221585988998413, "umap_1": 10.252674102783203, "region": "Chinese and Mongolian", "recipe": "Xihongshi Jiang - Stir-Fried Shrimp in Tomato Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56397"}, {"umap_0": -2.6721746921539307, "umap_1": 7.969666481018066, "region": "South American", "recipe": "Heirloom Tomato Salad With Grilled Garlic Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100061"}, {"umap_0": 10.209290504455566, "umap_1": 8.166001319885254, "region": "Mexican", "recipe": "Flan De Cafe", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87743"}, {"umap_0": 0.6465972065925598, "umap_1": 5.177090644836426, "region": "South American", "recipe": "S.l. 
Hot Roast Beef Party Sandwiches", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/95785"}, {"umap_0": 9.115842819213867, "umap_1": 8.394359588623047, "region": "Indian Subcontinent", "recipe": "Rice Pudding (Kheer)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/5129"}, {"umap_0": 10.87574577331543, "umap_1": 7.493214130401611, "region": "Mexican", "recipe": "Simply the Best Mexican Flan", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81000"}, {"umap_0": 0.8154590725898743, "umap_1": 11.127267837524414, "region": "Irish", "recipe": "Baked Corned Beef Brisket", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136547"}, {"umap_0": -1.821033239364624, "umap_1": 9.254093170166016, "region": "Italian", "recipe": "Drink from the Bowl Spaghetti Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/128890"}, {"umap_0": -1.8258922100067139, "umap_1": 9.031458854675293, "region": "Mexican", "recipe": "Black Beans and Rice(arroz Con Frijoles Negros)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78257"}, {"umap_0": 10.204330444335938, "umap_1": 5.338412284851074, "region": "Italian", "recipe": "Italian Soft Dead Man's Bones", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/121147"}, {"umap_0": 2.3421366214752197, "umap_1": 8.676328659057617, "region": "US", "recipe": "Sizzling Steak Fajitas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/15352"}, {"umap_0": 9.618403434753418, "umap_1": 6.705162525177002, "region": "Canadian", "recipe": "Agave Lavender Muffins", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148948"}, {"umap_0": 2.175485134124756, "umap_1": 7.622127532958984, "region": "Spanish and Portuguese", "recipe": "Fried Calamari", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141118"}, {"umap_0": 10.707467079162598, "umap_1": 10.320048332214355, "region": "US", "recipe": "Hawaii Five-O", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16759"}, {"umap_0": 10.739253044128418, "umap_1": 7.623933792114258, "region": "UK", "recipe": "Yummy Low Cal-Low Fat Pineapple Bread Pudding", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/103641"}, {"umap_0": 11.19957447052002, "umap_1": 7.613311767578125, "region": "US", "recipe": "Run For The Roses Pie II", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/18317"}, {"umap_0": -0.2407880425453186, "umap_1": 11.315374374389648, "region": "Mexican", "recipe": "Mexican Lentil Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/86849"}, {"umap_0": 0.09521909803152084, "umap_1": 9.362201690673828, "region": "Middle Eastern", "recipe": "Chickpea Patties in Pitas With Cilantro Yogurt", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/60474"}, {"umap_0": -0.47090598940849304, "umap_1": 6.503218650817871, "region": "Italian", "recipe": "Italian Breadcrumb-Crusted Tilapia W-Lemon", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124448"}, {"umap_0": -1.6213449239730835, "umap_1": 7.207785606384277, "region": "Italian", "recipe": "Italian Snapper", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/121777"}, {"umap_0": 1.3110674619674683, "umap_1": 5.751152992248535, "region": "Mexican", "recipe": "Mexican 9-Layer Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84085"}, {"umap_0": 1.878241777420044, "umap_1": 8.753812789916992, "region": "South American", "recipe": "Meme's Calico Baked Beans", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96924"}, {"umap_0": -0.8955507278442383, "umap_1": 5.35370397567749, "region": "Italian", "recipe": "Lasagna in the Pressure Cooker", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131117"}, {"umap_0": 11.097428321838379, "umap_1": 6.373406410217285, "region": "Deutschland", "recipe": "German Sour Cream Twists", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137808"}, {"umap_0": 0.007022353820502758, "umap_1": 11.734325408935547, "region": "Australian", "recipe": "Curried Pickled Onions", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76772"}, {"umap_0": 0.45874902606010437, "umap_1": 7.338892459869385, "region": "Eastern European", "recipe": "Frugal Gourmet's Polish Noodles and Cabbage", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/133360"}, {"umap_0": -0.1575111746788025, "umap_1": 5.806654453277588, "region": "US", "recipe": "Joelle's Famous Hot Crab and Artichoke Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16811"}, {"umap_0": 2.90606427192688, "umap_1": 7.009469985961914, "region": "Italian", "recipe": "Chewy Italian Batter Breadsticks", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/119751"}, {"umap_0": 9.997720718383789, "umap_1": 6.364442348480225, "region": "Scandinavian", "recipe": "The Ultimate Breakfast Cookie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/139209"}, {"umap_0": -0.500583827495575, "umap_1": 6.3571929931640625, "region": "Canadian", "recipe": "Lite Stuffed Mushrooms", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146603"}, {"umap_0": 9.942673683166504, "umap_1": 6.4778242111206055, "region": "Scandinavian", "recipe": "Gingersnaps (Pepparkakor)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105870"}, {"umap_0": 3.0555219650268555, "umap_1": 10.28598690032959, "region": "Korean", "recipe": "Korean Barbecue Wet Rub", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69136"}, {"umap_0": 11.022294044494629, "umap_1": 8.061348915100098, "region": "French", "recipe": "Millefeuille With Vanilla Cream and Chocolate Mousse", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112432"}, {"umap_0": -1.3980135917663574, "umap_1": 10.499991416931152, "region": "Rest Africa", "recipe": "Kele 
Wele", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48300"}, {"umap_0": 1.1709452867507935, "umap_1": 5.586130619049072, "region": "Mexican", "recipe": "Chile Con Queso (Chile-Cheese Dip)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83516"}, {"umap_0": -0.18024864792823792, "umap_1": 12.244661331176758, "region": "Indian Subcontinent", "recipe": "Masoor Dhal", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/65935"}, {"umap_0": 1.813773512840271, "umap_1": 8.55335521697998, "region": "Chinese and Mongolian", "recipe": "Southern Rice With Bacon Flavored Gravy", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53696"}, {"umap_0": -0.571422278881073, "umap_1": 8.44444751739502, "region": "Italian", "recipe": "Italian Baby Onions", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/119590"}, {"umap_0": 9.599403381347656, "umap_1": 9.754493713378906, "region": "UK", "recipe": "orange marmalade", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104262"}, {"umap_0": 1.1669400930404663, "umap_1": 10.129842758178711, "region": "Middle Eastern", "recipe": "Duck Fesenjan", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/2855"}, {"umap_0": -0.6406524181365967, "umap_1": 11.46854305267334, "region": "Caribbean", "recipe": "Pique (Puerto Rican Style Hot Sauce)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92074"}, {"umap_0": -1.5321619510650635, "umap_1": 10.56371021270752, "region": "Greek", "recipe": "Chickpeas With Spinach (Greek)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110400"}, {"umap_0": -1.3721771240234375, "umap_1": 5.206984519958496, "region": "Italian", "recipe": "Wishbone Italian Grilled Chicken Sandwiches", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/126702"}, {"umap_0": -0.7962071299552917, "umap_1": 5.5524492263793945, "region": "Italian", "recipe": "Eggplant (Aubergine) Rollatini", 
"url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131406"}, {"umap_0": -2.2322609424591064, "umap_1": 7.801762104034424, "region": "Greek", "recipe": "Brown Rice With Spinach and Feta Cheese", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/108296"}, {"umap_0": 1.2572479248046875, "umap_1": 9.441682815551758, "region": "US", "recipe": "Spiced Pumpkin Seeds", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/18420"}, {"umap_0": 1.802507758140564, "umap_1": 9.266167640686035, "region": "Indian Subcontinent", "recipe": "Broasted Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72359"}, {"umap_0": -0.09042278677225113, "umap_1": 9.84847354888916, "region": "Rest Africa", "recipe": "Chickpea and Beef Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47588"}, {"umap_0": -1.0725185871124268, "umap_1": 6.658015251159668, "region": "Italian", "recipe": "Linguine With Bacon and Onions", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127338"}, {"umap_0": 10.785246849060059, "umap_1": 9.027557373046875, "region": "Australian", "recipe": "Florentines With Berry Ice Cream", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73862"}, {"umap_0": -0.3742625415325165, "umap_1": 11.10119915008545, "region": "Rest Africa", "recipe": "Tunisian Fish Cakes With Lemon-Paprika A\u00efoli (Passover)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49309"}, {"umap_0": 0.23900194466114044, "umap_1": 9.156368255615234, "region": "Deutschland", "recipe": "Swiss Pork and Mushrooms", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118228"}, {"umap_0": -0.03058946318924427, "umap_1": 6.014091491699219, "region": "Mexican", "recipe": "Cheese Soup With Jalape\u00f1o Pesto", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/88022"}, {"umap_0": 1.4410419464111328, "umap_1": 7.9549078941345215, "region": "Mexican", "recipe": "Spanish Rice 
Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82914"}, {"umap_0": 0.13814035058021545, "umap_1": 5.951913833618164, "region": "Italian", "recipe": "Chicken and Asparagus Fettuccine", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/11577"}, {"umap_0": 0.7318601012229919, "umap_1": 9.806900978088379, "region": "Mexican", "recipe": "Veggie Chili", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84548"}, {"umap_0": 1.459996223449707, "umap_1": 9.857269287109375, "region": "Caribbean", "recipe": "Curried Potato Chips", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91801"}, {"umap_0": 3.524404287338257, "umap_1": 9.201982498168945, "region": "Korean", "recipe": "Korean Soybean Noodles (Kong Kook Su)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4756"}, {"umap_0": -1.940906047821045, "umap_1": 6.084734916687012, "region": "Italian", "recipe": "Penne Pasta with Veggies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12043"}, {"umap_0": -1.498612403869629, "umap_1": 8.07089614868164, "region": "Italian", "recipe": "Vegan Sausage Rigatoni", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125480"}, {"umap_0": 11.236673355102539, "umap_1": 6.95028829574585, "region": "French", "recipe": "Denny's-Style French Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115238"}, {"umap_0": 1.5257461071014404, "umap_1": 6.217804908752441, "region": "Mexican", "recipe": "Turkey Soft Tacos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/5743"}, {"umap_0": -0.5946857333183289, "umap_1": 6.078432559967041, "region": "Northern Africa", "recipe": "Tunisian Egg Briks", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/50064"}, {"umap_0": -0.23371565341949463, "umap_1": 12.626899719238281, "region": "Mexican", "recipe": "Lemon-Chili Rub", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79075"}, 
{"umap_0": 1.839859962463379, "umap_1": 6.068512439727783, "region": "Mexican", "recipe": "Sweet and Savory Black Bean Plaintain Burritos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85725"}, {"umap_0": -2.038372039794922, "umap_1": 7.396195888519287, "region": "Italian", "recipe": "Pasta Ai Fiori Di Zucca (Pasta With Zucchini Blossoms)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12210"}, {"umap_0": 10.95815658569336, "umap_1": 6.722874164581299, "region": "French", "recipe": "Nutella Brioche Rolls", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/117234"}, {"umap_0": -1.9898791313171387, "umap_1": 9.743003845214844, "region": "UK", "recipe": "Tomato Butter Tea Sandwiches", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104630"}, {"umap_0": 3.3392016887664795, "umap_1": 10.062098503112793, "region": "Korean", "recipe": "My Auntie's Real Bulgogi", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4694"}, {"umap_0": -1.6173770427703857, "umap_1": 8.124360084533691, "region": "Italian", "recipe": "Aphrodisiac Tagliatelle with Blue Cheese Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12274"}, {"umap_0": -1.9416553974151611, "umap_1": 7.549081325531006, "region": "Italian", "recipe": "Super Easy Italian Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131858"}, {"umap_0": -0.26277557015419006, "umap_1": 7.222153186798096, "region": "Scandinavian", "recipe": "Strawberry Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/106910"}, {"umap_0": 0.4316873848438263, "umap_1": 10.186958312988281, "region": "Mexican", "recipe": "Slow Cooker Pozole (Chicken and Pork)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85566"}, {"umap_0": -1.8928779363632202, "umap_1": 9.587043762207031, "region": "Greek", "recipe": "Marinated Feta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/109586"}, 
{"umap_0": 1.2385129928588867, "umap_1": 5.460117340087891, "region": "Central American", "recipe": "South American Pizza With Chorizo and Chili Peppers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93002"}, {"umap_0": -0.6707525253295898, "umap_1": 7.432048797607422, "region": "Australian", "recipe": "Light Smoked Fish Pie With Kumara Mash", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75118"}, {"umap_0": 0.7068323493003845, "umap_1": 5.762147426605225, "region": "Middle Eastern", "recipe": "Kurdish Kahdeh", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70069"}, {"umap_0": 1.5810576677322388, "umap_1": 5.0056962966918945, "region": "Mexican", "recipe": "Chorizo Sausage Chihuahua Dogs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87365"}, {"umap_0": 0.671999454498291, "umap_1": 12.363466262817383, "region": "Indian Subcontinent", "recipe": "Chicken in Gravy (Indian-Punjabi Style)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67385"}, {"umap_0": -0.8012382984161377, "umap_1": 9.176366806030273, "region": "South American", "recipe": "Feijao Na Pressao (Brazilian Black Beans in the Pressure Cooker)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/8314"}, {"umap_0": 1.0655608177185059, "umap_1": 8.46761703491211, "region": "South American", "recipe": "feijoada (brazilian black bean stew)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93429"}, {"umap_0": 10.204320907592773, "umap_1": 8.843903541564941, "region": "French", "recipe": "Easy Chai Creme Caramel (Aka Chai Flan De Leche)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/111952"}, {"umap_0": 10.617517471313477, "umap_1": 6.73212194442749, "region": "Belgian", "recipe": "Dutch Vandermint Torte With Vandermint Coffee Frosting", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134010"}, {"umap_0": 1.1044844388961792, "umap_1": 9.912679672241211, 
"region": "US", "recipe": "Buffalo Shrimp", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17622"}, {"umap_0": 0.8588863611221313, "umap_1": 7.437868118286133, "region": "French", "recipe": "My Mom's French Canadian Cretons", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113988"}, {"umap_0": 3.0241997241973877, "umap_1": 10.527693748474121, "region": "Mexican", "recipe": "Foley's Grilled Salmon With Soy Ginger", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79398"}, {"umap_0": -0.8556214570999146, "umap_1": 8.183527946472168, "region": "Italian", "recipe": "Pollo Alla Montanara (Chicken in Wine Sauce)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129634"}, {"umap_0": 0.16890136897563934, "umap_1": 11.511482238769531, "region": "Indian Subcontinent", "recipe": "Indian Cauliflower and Kidney Bean Stew With Coconut Milk", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66998"}, {"umap_0": 10.652979850769043, "umap_1": 10.26740550994873, "region": "Australian", "recipe": "Drunken Watermelon", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76384"}, {"umap_0": 3.737821102142334, "umap_1": 9.937808990478516, "region": "Japanese", "recipe": "Japanese Cabbage Salad with Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71077"}, {"umap_0": 0.01138520147651434, "umap_1": 8.325850486755371, "region": "Scandinavian", "recipe": "Chicken Fricass\u00e9e With Creamy Sweet-And-Sour Dill Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/139682"}, {"umap_0": 2.902860403060913, "umap_1": 9.898431777954102, "region": "Chinese and Mongolian", "recipe": "Deb's General Tso's Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/2896"}, {"umap_0": 3.3006980419158936, "umap_1": 10.426973342895508, "region": "Australian", "recipe": "Chilli Plum Crabs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77413"}, 
{"umap_0": -1.1858551502227783, "umap_1": 7.470452785491943, "region": "Italian", "recipe": "Original Carrabba's Chicken Bryan (Not Copycat!)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124919"}, {"umap_0": -1.8793965578079224, "umap_1": 5.698782444000244, "region": "Italian", "recipe": "Easy Italian Vegetable Packets", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125318"}, {"umap_0": -0.09685136377811432, "umap_1": 5.617732524871826, "region": "Chinese and Mongolian", "recipe": "Buttered Parmesan Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53803"}, {"umap_0": 10.380111694335938, "umap_1": 7.766465187072754, "region": "Canadian", "recipe": "Lemon Soymilk Pudding", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148979"}, {"umap_0": 9.107453346252441, "umap_1": 6.891109943389893, "region": "Scandinavian", "recipe": "Rosettes I", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/13419"}, {"umap_0": -0.004933270625770092, "umap_1": 4.822179317474365, "region": "Australian", "recipe": "Cheese &amp; Vegemite Mini Quiches", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77559"}, {"umap_0": 0.6233452558517456, "umap_1": 10.268095970153809, "region": "Caribbean", "recipe": "Caribbean Vegetable Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91741"}, {"umap_0": -0.2816180884838104, "umap_1": 6.0629730224609375, "region": "Deutschland", "recipe": "Scallops in Swiss Cheese Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/117950"}, {"umap_0": -2.117828369140625, "umap_1": 8.800236701965332, "region": "Northern Africa", "recipe": "Red Pepper, Lemon and Mint Couscous", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49548"}, {"umap_0": 0.9907320141792297, "umap_1": 4.8527021408081055, "region": "Mexican", "recipe": "Mexican Cheese Snacks", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80855"}, {"umap_0": 11.093069076538086, "umap_1": 7.582452297210693, "region": "Australian", "recipe": "Flourless Chocolate Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74499"}, {"umap_0": 2.122055768966675, "umap_1": 5.0107102394104, "region": "South American", "recipe": "Shredded Beef Enchiladas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96506"}, {"umap_0": -1.4233676195144653, "umap_1": 6.529354572296143, "region": "Italian", "recipe": "Spaghetti with Beef and Mushroom Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127768"}, {"umap_0": 9.445223808288574, "umap_1": 6.677623748779297, "region": "Australian", "recipe": "Chocolate and Almond Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75575"}, {"umap_0": -1.8812910318374634, "umap_1": 10.176521301269531, "region": "Italian", "recipe": "Pasta With Garlic and Oil (Pasta Aglio E Olio)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131484"}, {"umap_0": -1.4044510126113892, "umap_1": 9.174473762512207, "region": "French", "recipe": "Artichoke Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/111806"}, {"umap_0": -1.857147455215454, "umap_1": 9.701681137084961, "region": "French", "recipe": "Confit D'oignon - Onion Preserve", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112460"}, {"umap_0": 10.007903099060059, "umap_1": 5.882359981536865, "region": "Canadian", "recipe": "Mom,s Raisin Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144271"}, {"umap_0": 0.7440477013587952, "umap_1": 7.946354866027832, "region": "French", "recipe": "Chicken in a Pot", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113869"}, {"umap_0": -2.482712507247925, "umap_1": 8.086446762084961, "region": "Greek", "recipe": "Healthy Greek Salad", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/107714"}, {"umap_0": -1.9129527807235718, "umap_1": 6.97475528717041, "region": "Italian", "recipe": "Appetizing Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/11344"}, {"umap_0": 0.7719407677650452, "umap_1": 5.481480598449707, "region": "Mexican", "recipe": "Jalapeno Popper Quesadillas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82203"}, {"umap_0": 3.3700122833251953, "umap_1": 10.080263137817383, "region": "Canadian", "recipe": "Orange-apricot Prawns", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144799"}, {"umap_0": 0.7927509546279907, "umap_1": 6.950196266174316, "region": "Belgian", "recipe": "Belgian Style Carrot Coins", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/106693"}, {"umap_0": 1.1569143533706665, "umap_1": 8.28869342803955, "region": "South American", "recipe": "Cheesy Ground Beef and Rice Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/94665"}, {"umap_0": -0.6531607508659363, "umap_1": 10.195749282836914, "region": "Indian Subcontinent", "recipe": "Chucumber (Indian Cucumber Salad)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64524"}, {"umap_0": 1.734744668006897, "umap_1": 7.619813919067383, "region": "Deutschland", "recipe": "Falscher Hase (False Hare German Meatballs)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138294"}, {"umap_0": 2.5987367630004883, "umap_1": 10.643059730529785, "region": "Southeast Asian", "recipe": "Indonesian Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59860"}, {"umap_0": 2.723863124847412, "umap_1": 8.864019393920898, "region": "South American", "recipe": "Coconut Curry Beef Noodles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/94858"}, {"umap_0": 0.9650956988334656, "umap_1": 4.880723476409912, "region": "US", "recipe": "Mexican Chicken Casserole", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/18074"}, {"umap_0": -1.263389229774475, "umap_1": 8.040308952331543, "region": "US", "recipe": "Killer Shrimp Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17945"}, {"umap_0": 11.297856330871582, "umap_1": 7.8658447265625, "region": "French", "recipe": "Baked Cinnamon French Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115143"}, {"umap_0": 11.402894020080566, "umap_1": 8.030360221862793, "region": "Deutschland", "recipe": "Black Forest Cherry Cheesecake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137850"}, {"umap_0": 0.11632750183343887, "umap_1": 11.490843772888184, "region": "Middle Eastern", "recipe": "Tagine of Lamb with Quinces", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61662"}, {"umap_0": -1.4687013626098633, "umap_1": 8.50658130645752, "region": "Italian", "recipe": "Chicken Piccata (South Beach Phase 1)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/126525"}, {"umap_0": -1.2568435668945312, "umap_1": 7.199586391448975, "region": "Italian", "recipe": "Shaved Brussels Sprouts With Pancetta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/119915"}, {"umap_0": 0.09890953451395035, "umap_1": 9.473420143127441, "region": "Canadian", "recipe": "Pictou County Pizza Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144697"}, {"umap_0": -0.05035017803311348, "umap_1": 8.57482624053955, "region": "Deutschland", "recipe": "Simple Viennese Beef Goulash", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/107297"}, {"umap_0": 2.7068521976470947, "umap_1": 10.448904037475586, "region": "Japanese", "recipe": "Japanese Salmon Burger", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70943"}, {"umap_0": 10.349726676940918, "umap_1": 7.970216751098633, "region": "Deutschland", "recipe": "Zimtmakronen - Cinnamon Cookies", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137381"}, {"umap_0": 9.381885528564453, "umap_1": 6.707773208618164, "region": "Deutschland", "recipe": "Mom's Kuchen", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137283"}, {"umap_0": 0.9319941997528076, "umap_1": 8.81534481048584, "region": "Caribbean", "recipe": "Rice and Peas With Ham", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90735"}, {"umap_0": 10.162126541137695, "umap_1": 5.602627277374268, "region": "Canadian", "recipe": "Maple Muffins", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147185"}, {"umap_0": 0.9786609411239624, "umap_1": 11.45162582397461, "region": "Mexican", "recipe": "Light Flour Tortillas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84775"}, {"umap_0": 10.572319030761719, "umap_1": 10.323277473449707, "region": "Mexican", "recipe": "Mango, Chilli and Lime Margarita", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83066"}, {"umap_0": 0.7196156978607178, "umap_1": 13.115609169006348, "region": "Indian Subcontinent", "recipe": "Shrimp in Coconut Gravy", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67691"}, {"umap_0": 2.4053750038146973, "umap_1": 9.922897338867188, "region": "Chinese and Mongolian", "recipe": "Stir-Fried Chicken With Pineapple and Peppers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/2934"}, {"umap_0": 3.3454458713531494, "umap_1": 9.315509796142578, "region": "Chinese and Mongolian", "recipe": "Chinese Beef With Anise Hyssop", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53004"}, {"umap_0": 10.13790225982666, "umap_1": 8.951178550720215, "region": "French", "recipe": "Honey-Soaked Goat Cheese", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113255"}, {"umap_0": 0.6283879280090332, "umap_1": 9.493538856506348, "region": "Mexican", "recipe": "Chicken and Black Bean Casserole", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6700"}, {"umap_0": 11.76770305633545, "umap_1": 8.455057144165039, "region": "Canadian", "recipe": "No Bake Cream Cheese Dessert", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145719"}, {"umap_0": 0.039625704288482666, "umap_1": 12.190072059631348, "region": "Indian Subcontinent", "recipe": "Chicken Tikka Mumbai", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/65204"}, {"umap_0": 0.9696589708328247, "umap_1": 10.34250545501709, "region": "Thai", "recipe": "Thai Red Curry With Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58022"}, {"umap_0": 10.620315551757812, "umap_1": 10.249897003173828, "region": "Canadian", "recipe": "Tropical Strawberry Tea", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143962"}, {"umap_0": 1.2627520561218262, "umap_1": 5.777937889099121, "region": "Mexican", "recipe": "Grilled Breakfast Burrito", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82869"}, {"umap_0": 0.7471975684165955, "umap_1": 7.868103504180908, "region": "Australian", "recipe": "Veal Birds", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75949"}, {"umap_0": 10.360438346862793, "umap_1": 8.682806015014648, "region": "Central American", "recipe": "South American Raisin Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92959"}, {"umap_0": 2.3578615188598633, "umap_1": 9.635660171508789, "region": "Australian", "recipe": "Cocktail Meatballs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75802"}, {"umap_0": -2.773643732070923, "umap_1": 8.544965744018555, "region": "Italian", "recipe": "Pan-Fried Garlic Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130407"}, {"umap_0": -0.48247095942497253, "umap_1": 8.675886154174805, "region": "French", "recipe": "Snails Sommeroise - Escargots a la Sommeroise", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113207"}, {"umap_0": 8.992899894714355, "umap_1": 6.524967670440674, "region": "Eastern European", "recipe": "Choereg (Armenian Easter Bread)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/13538"}, {"umap_0": 3.416114568710327, "umap_1": 9.989638328552246, "region": "Australian", "recipe": "Asian Lettuce Wraps", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/5308"}, {"umap_0": 8.348167419433594, "umap_1": 6.987048625946045, "region": "Rest Africa", "recipe": "Steamed Cornbread-Cakes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47557"}, {"umap_0": 9.833870887756348, "umap_1": 7.041590690612793, "region": "UK", "recipe": "Green Tea Shortbread Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101467"}, {"umap_0": 1.2901328802108765, "umap_1": 10.511434555053711, "region": "South American", "recipe": "Dr Pepper Beef Marinade", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99886"}, {"umap_0": -0.23262551426887512, "umap_1": 10.578851699829102, "region": "Caribbean", "recipe": "Roasted Corn, Black Bean, and Mango Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91037"}, {"umap_0": 0.5840230584144592, "umap_1": 9.741433143615723, "region": "Mexican", "recipe": "Mom's Carne Guisada", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/7540"}, {"umap_0": 10.445120811462402, "umap_1": 8.04233169555664, "region": "Indian Subcontinent", "recipe": "Cheese Pudding With Raisins and Almonds", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66517"}, {"umap_0": 0.48519623279571533, "umap_1": 8.37837028503418, "region": "Irish", "recipe": "Irish Pot Roast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134890"}, {"umap_0": -2.2729949951171875, "umap_1": 6.701333999633789, "region": "Middle Eastern", "recipe": "Falafel Pitas", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/60903"}, {"umap_0": 0.4105159640312195, "umap_1": 10.130810737609863, "region": "Central American", "recipe": "American Women's Club of Luxembourg Chili", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92842"}, {"umap_0": 8.838973045349121, "umap_1": 6.7364182472229, "region": "Indian Subcontinent", "recipe": "Roti Canai_Paratha (Indian Pancake)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4609"}, {"umap_0": 0.05471676215529442, "umap_1": 7.252633094787598, "region": "Mexican", "recipe": "360 Cookware Chorizo Bean Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79989"}, {"umap_0": 2.023939847946167, "umap_1": 4.727809429168701, "region": "Mexican", "recipe": "Taco Casserole Surprise", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87725"}, {"umap_0": 1.1099916696548462, "umap_1": 12.706733703613281, "region": "Indian Subcontinent", "recipe": "Anjeeri Khumb with only 3 tsps. 
oil and maraschino cherries!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64894"}, {"umap_0": 1.1657395362854004, "umap_1": 8.06629467010498, "region": "Canadian", "recipe": "Smoked Pork Chops With Potatoes &amp; Sauerkraut", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147740"}, {"umap_0": 12.081353187561035, "umap_1": 4.686682224273682, "region": "Deutschland", "recipe": "Spaetzle - German", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138779"}, {"umap_0": 0.2857591509819031, "umap_1": 6.614966869354248, "region": "Deutschland", "recipe": "German Potato Salad With Keilbasa #5FIX", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137184"}, {"umap_0": -0.37483659386634827, "umap_1": 6.65553617477417, "region": "French", "recipe": "French Onion Tomato Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113544"}, {"umap_0": 0.7989017367362976, "umap_1": 9.281834602355957, "region": "Mexican", "recipe": "Nopalitos Con Huevos Breakfast Tacos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/86660"}, {"umap_0": 0.7640136480331421, "umap_1": 12.481929779052734, "region": "Indian Subcontinent", "recipe": "Tuna Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64275"}, {"umap_0": -2.495041608810425, "umap_1": 9.036615371704102, "region": "Italian", "recipe": "Olive Oil, Garlic, and Crushed Red Pepper Pasta Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/119576"}, {"umap_0": -0.6572386622428894, "umap_1": 10.325284004211426, "region": "Middle Eastern", "recipe": "Iranian Sugar Pickled Garlic", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/60582"}, {"umap_0": 1.4814096689224243, "umap_1": 6.053394794464111, "region": "South American", "recipe": "Papas Huancaino", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93837"}, {"umap_0": 10.918684959411621, "umap_1": 6.646834373474121, "region": 
"Deutschland", "recipe": "Chocolate Filled Almond Spritz Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138340"}, {"umap_0": -1.8504774570465088, "umap_1": 10.791175842285156, "region": "Australian", "recipe": "Succulent Prawns for the Barbie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75535"}, {"umap_0": -2.435455322265625, "umap_1": 6.750731945037842, "region": "Italian", "recipe": "Basil Butter Garlic Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132352"}, {"umap_0": 1.0257591009140015, "umap_1": 9.099783897399902, "region": "Greek", "recipe": "Dolma", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/9374"}, {"umap_0": 1.4252159595489502, "umap_1": 8.448905944824219, "region": "Chinese and Mongolian", "recipe": "Chicken With Szechuan Rice for 2", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/57164"}, {"umap_0": 1.1825228929519653, "umap_1": 10.686308860778809, "region": "Australian", "recipe": "Savoury Pumpkin Quiche", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74887"}, {"umap_0": 9.329658508300781, "umap_1": 6.513767242431641, "region": "Belgian", "recipe": "Liege Waffle Recipe - Li\u00e8ge Gaufre Recette", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/106568"}, {"umap_0": 1.5835986137390137, "umap_1": 6.910691261291504, "region": "South American", "recipe": "Easy Beef \u2018n\u2019 Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98468"}, {"umap_0": 0.8244311213493347, "umap_1": 8.260699272155762, "region": "Deutschland", "recipe": "Pork Chops and Pierogies for 2", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136971"}, {"umap_0": 10.9827299118042, "umap_1": 8.54349136352539, "region": "UK", "recipe": "Traditional English Trifle", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/103800"}, {"umap_0": 0.7938772439956665, "umap_1": 12.604090690612793, "region": 
"Indian Subcontinent", "recipe": "Chicken or Lamb Biriyani", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64236"}, {"umap_0": 0.772128701210022, "umap_1": 12.36756706237793, "region": "Indian Subcontinent", "recipe": "Baked Fish Wrapped in Banana Leaves", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/68717"}, {"umap_0": 1.202843427658081, "umap_1": 11.409436225891113, "region": "Rest Africa", "recipe": "Sweet  N' Spicy Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/51388"}, {"umap_0": -1.6273596286773682, "umap_1": 10.540791511535645, "region": "US", "recipe": "White Bean Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/18213"}, {"umap_0": -0.3158065974712372, "umap_1": 8.270655632019043, "region": "Canadian", "recipe": "Seared Pork Medallions With Saskatoon (Or Blueberry) and Rhubarb", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147201"}, {"umap_0": -0.035181835293769836, "umap_1": 9.522159576416016, "region": "Mexican", "recipe": "Two Bean Caserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78596"}, {"umap_0": -0.2078840285539627, "umap_1": 5.976397514343262, "region": "Mexican", "recipe": "Spinach and Mushroom Quesadillas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79915"}, {"umap_0": 1.077321171760559, "umap_1": 8.21718692779541, "region": "French", "recipe": "Brasserie's Le Coze's French Onion Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114236"}, {"umap_0": 9.993802070617676, "umap_1": 5.464186668395996, "region": "South American", "recipe": "Heirloom Raisin Muffins", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100055"}, {"umap_0": 0.37413763999938965, "umap_1": 8.716294288635254, "region": "Belgian", "recipe": "Pennsylvania Dutch Potato Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134357"}, {"umap_0": 2.247965097427368, "umap_1": 
11.000914573669434, "region": "Indian Subcontinent", "recipe": "Turkey Meatball Korma", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/68864"}, {"umap_0": 9.75698471069336, "umap_1": 7.453516006469727, "region": "Indian Subcontinent", "recipe": "Sandesh", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67002"}, {"umap_0": 2.091667652130127, "umap_1": 6.456871509552002, "region": "Mexican", "recipe": "Enchiladas Coloradas (Red Enchiladas)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80572"}, {"umap_0": 1.86276376247406, "umap_1": 10.717509269714355, "region": "Chinese and Mongolian", "recipe": "Asian Spicy Plum Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/51848"}, {"umap_0": 2.530447006225586, "umap_1": 10.041291236877441, "region": "Southeast Asian", "recipe": "Cheat n' Eat Vietnamese Chicken Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62512"}, {"umap_0": 1.0116397142410278, "umap_1": 12.314139366149902, "region": "Indian Subcontinent", "recipe": "Aromatic Peas and Carrot Pilaf for one", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64705"}, {"umap_0": 1.7215689420700073, "umap_1": 10.256852149963379, "region": "Southeast Asian", "recipe": "Myrna's Curried Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63749"}, {"umap_0": -0.9524049162864685, "umap_1": 10.427093505859375, "region": "Mexican", "recipe": "Seafood Ceviche", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/7066"}, {"umap_0": -0.25578081607818604, "umap_1": 6.638718128204346, "region": "Caribbean", "recipe": "Trinidadian Macaroni Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91367"}, {"umap_0": -0.8399298787117004, "umap_1": 7.697590351104736, "region": "Deutschland", "recipe": "Easy German Potato Salad #5FIX", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137811"}, {"umap_0": -0.7990944385528564, 
"umap_1": 5.186559677124023, "region": "Italian", "recipe": "Asparagus Milanese", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132751"}, {"umap_0": -0.05589376762509346, "umap_1": 8.074045181274414, "region": "Deutschland", "recipe": "Veal Steaks", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137690"}, {"umap_0": 1.852655053138733, "umap_1": 10.20706558227539, "region": "Chinese and Mongolian", "recipe": "Steamed Sea Bass with Black Bean Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/57192"}, {"umap_0": -1.7756612300872803, "umap_1": 8.199274063110352, "region": "Italian", "recipe": "Michael Chiarello's Pickled Antipasto", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132715"}, {"umap_0": -2.453618049621582, "umap_1": 9.88633918762207, "region": "UK", "recipe": "Beetroot &amp; Horseradish Puree", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102512"}, {"umap_0": 0.4707120954990387, "umap_1": 6.374220371246338, "region": "Australian", "recipe": "Tassie Curry Scallop Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75158"}, {"umap_0": 1.3775466680526733, "umap_1": 10.290270805358887, "region": "South American", "recipe": "Prairie Fire Beef Brisket", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96182"}, {"umap_0": -0.4943050146102905, "umap_1": 8.376326560974121, "region": "Italian", "recipe": "Southern Italian Pasta Fagioli", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127393"}, {"umap_0": -1.4104938507080078, "umap_1": 5.244559288024902, "region": "Italian", "recipe": "Zucchini Manicotti", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125556"}, {"umap_0": 0.49602025747299194, "umap_1": 6.53398323059082, "region": "Chinese and Mongolian", "recipe": "Cajun Quiche in a Rice Crust", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/54409"}, {"umap_0": 10.416463851928711, "umap_1": 
7.738397598266602, "region": "Australian", "recipe": "Pancakes With Vanilla Banana (Using an Egg Replacer)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74010"}, {"umap_0": 1.4576456546783447, "umap_1": 5.499892711639404, "region": "Mexican", "recipe": "Cinco de Mayo", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80232"}, {"umap_0": 2.1327641010284424, "umap_1": 10.662544250488281, "region": "Australian", "recipe": "Rack of Lamb With Chilli-Mint Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77406"}, {"umap_0": -0.9710488319396973, "umap_1": 9.235594749450684, "region": "Chinese and Mongolian", "recipe": "Florida Beans and Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/54232"}, {"umap_0": 1.3386688232421875, "umap_1": 9.817499160766602, "region": "Rest Africa", "recipe": "Peanut (Groundnut) Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48987"}, {"umap_0": -0.7660301923751831, "umap_1": 9.261104583740234, "region": "Rest Africa", "recipe": "Preserved Lemons", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48174"}, {"umap_0": 0.6717744469642639, "umap_1": 5.91477108001709, "region": "Deutschland", "recipe": "Nudeln Mit Spinat Und Muskatnuss (Noodles W. 
Spinach and Nutmeg)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137013"}, {"umap_0": 0.5330303907394409, "umap_1": 6.743196964263916, "region": "Scandinavian", "recipe": "Dromskinka (Dream Ham)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/9153"}, {"umap_0": 9.152359962463379, "umap_1": 6.921017169952393, "region": "Scandinavian", "recipe": "Gluten Free Finnish (Hoito) Pancake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/139557"}, {"umap_0": 1.2150393724441528, "umap_1": 9.744861602783203, "region": "French", "recipe": "Onion Straws (French Fried Onions)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115865"}, {"umap_0": -1.9553046226501465, "umap_1": 9.343260765075684, "region": "Australian", "recipe": "Caramelised Onion &amp; Goats Cheese Omelette", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73935"}, {"umap_0": 3.6354057788848877, "umap_1": 10.077268600463867, "region": "Korean", "recipe": "Wanjakuk (Korean Meatball Soup)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69292"}, {"umap_0": -1.2453991174697876, "umap_1": 5.132853031158447, "region": "Italian", "recipe": "Sausage Zucchini Boats", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123708"}, {"umap_0": 2.2950439453125, "umap_1": 9.934898376464844, "region": "Southeast Asian", "recipe": "Acar Campur (Pickled Vegetables)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59795"}, {"umap_0": -1.4231547117233276, "umap_1": 7.371896743774414, "region": "Canadian", "recipe": "Bear Italiano - Bear Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145070"}, {"umap_0": -0.5152446627616882, "umap_1": 10.045548439025879, "region": "Mexican", "recipe": "Crock Pot Mexicali Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78045"}, {"umap_0": 10.830865859985352, "umap_1": 7.895906925201416, "region": "French", "recipe": "Baked 
Blueberry Pecan French Toast With Blueberry Syrup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/111721"}, {"umap_0": -1.1581099033355713, "umap_1": 9.701723098754883, "region": "Italian", "recipe": "Bomba Calabrese (Spicy Calabrian Pepper Spread)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/11040"}, {"umap_0": -0.6526889204978943, "umap_1": 10.805870056152344, "region": "Chinese and Mongolian", "recipe": "Prawn Sang Choy Bow", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55309"}, {"umap_0": 0.6108986139297485, "umap_1": 4.784968852996826, "region": "Mexican", "recipe": "Oops Tex Mex Potatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78434"}, {"umap_0": -1.1193559169769287, "umap_1": 5.877293109893799, "region": "Italian", "recipe": "Meatballs Ala Angela", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130922"}, {"umap_0": 10.286702156066895, "umap_1": 7.304640769958496, "region": "Spanish and Portuguese", "recipe": "Pastissets (Powdered Sugar Cookies from Spain)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140546"}, {"umap_0": 0.4062589704990387, "umap_1": 10.26050090789795, "region": "Caribbean", "recipe": "Caribbean Chicken Curry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90497"}, {"umap_0": 0.033274564892053604, "umap_1": 8.684553146362305, "region": "Central American", "recipe": "Martha Stewart's All American Meatloaf", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92963"}, {"umap_0": 3.0481154918670654, "umap_1": 10.062190055847168, "region": "Japanese", "recipe": "Beef Kushiyaki", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4995"}, {"umap_0": -0.8106408715248108, "umap_1": 6.62211275100708, "region": "Italian", "recipe": "Egg Noodle Casserole With Italian Cheese", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/119647"}, {"umap_0": 1.5702959299087524, "umap_1": 
6.314182281494141, "region": "Mexican", "recipe": "Mexican Vegetable Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87879"}, {"umap_0": 1.1275120973587036, "umap_1": 5.459662437438965, "region": "Mexican", "recipe": "Longhorn Chili Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84158"}, {"umap_0": -0.8691716194152832, "umap_1": 9.091316223144531, "region": "Italian", "recipe": "Italian Sausage, Sicilian Style", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130070"}, {"umap_0": 0.06120110675692558, "umap_1": 7.997251987457275, "region": "Japanese", "recipe": "Bacon Mushroom Burger", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70786"}, {"umap_0": 3.1422221660614014, "umap_1": 9.054335594177246, "region": "Chinese and Mongolian", "recipe": "Savory Chinese Beef (Crock Pot)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56049"}, {"umap_0": -1.2090941667556763, "umap_1": 8.22006607055664, "region": "Mexican", "recipe": "Mexican Night Burritos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81434"}, {"umap_0": -0.8484283685684204, "umap_1": 8.321549415588379, "region": "Australian", "recipe": "Sage and Macadamia Pork Roast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75037"}, {"umap_0": 10.7693510055542, "umap_1": 10.191994667053223, "region": "Australian", "recipe": "Vanilla Ginger Sour Cocktail", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72676"}, {"umap_0": 0.10555657744407654, "umap_1": 9.530282974243164, "region": "Middle Eastern", "recipe": "Lamb and Bulgur Soup (Shorba Freek)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3689"}, {"umap_0": 2.9676406383514404, "umap_1": 10.062798500061035, "region": "Japanese", "recipe": "General Tso Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71518"}, {"umap_0": 1.05437171459198, "umap_1": 6.2700581550598145, "region": 
"Mexican", "recipe": "Huevos Rancheros Sandwiches", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80098"}, {"umap_0": 10.907604217529297, "umap_1": 7.440410614013672, "region": "French", "recipe": "Almond French Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/116893"}, {"umap_0": -0.578080952167511, "umap_1": 7.900142192840576, "region": "Eastern European", "recipe": "Summer Borscht (Lower Fat)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100851"}, {"umap_0": -1.1572819948196411, "umap_1": 4.556168079376221, "region": "South American", "recipe": "Delicious Ground Beef Pizza Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99097"}, {"umap_0": 11.268598556518555, "umap_1": 7.175631999969482, "region": "UK", "recipe": "Melt-In-Your-Mouth Mini Chocolate Chip Shortbread Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101554"}, {"umap_0": -0.17080117762088776, "umap_1": 9.19865608215332, "region": "Chinese and Mongolian", "recipe": "Rosemary Chicken With Wild Rice (Weight Watchers)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53898"}, {"umap_0": 0.8463926911354065, "umap_1": 12.083699226379395, "region": "Indian Subcontinent", "recipe": "Chicken and Vegetable Curry (Indian)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66464"}, {"umap_0": 8.479124069213867, "umap_1": 7.198885440826416, "region": "Italian", "recipe": "Nick's Pizzelle Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/121131"}, {"umap_0": 0.6675624847412109, "umap_1": 12.89366626739502, "region": "Indian Subcontinent", "recipe": "Tamarind Chutney", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66825"}, {"umap_0": 0.5457795262336731, "umap_1": 9.581137657165527, "region": "Mexican", "recipe": "Chilli Beans with Spicy Tortilla Crisps", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85660"}, 
{"umap_0": 1.915384292602539, "umap_1": 4.656497001647949, "region": "Mexican", "recipe": "Jen's Nachos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/7425"}, {"umap_0": 2.7924530506134033, "umap_1": 10.67573070526123, "region": "Japanese", "recipe": "Green Beans with Peanut Dressing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71725"}, {"umap_0": 1.935951828956604, "umap_1": 4.967324733734131, "region": "Mexican", "recipe": "Mexican Dinner Pizza", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81746"}, {"umap_0": -1.4189534187316895, "umap_1": 6.698885440826416, "region": "Italian", "recipe": "Mozzarella Chicken Ciabatta Sandwiches", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131675"}, {"umap_0": 10.237968444824219, "umap_1": 6.2917704582214355, "region": "Rest Africa", "recipe": "Banana Bread (from Africa!)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47812"}, {"umap_0": -0.3882516920566559, "umap_1": 9.124814987182617, "region": "Italian", "recipe": "Chicken Diablo", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129719"}, {"umap_0": 0.5488468408584595, "umap_1": 7.219930648803711, "region": "UK", "recipe": "Hampshire Cream Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104697"}, {"umap_0": 3.3690927028656006, "umap_1": 9.839742660522461, "region": "Chinese and Mongolian", "recipe": "Lillian's Spring Onion and Ginger Noodles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/51974"}, {"umap_0": 10.2373685836792, "umap_1": 6.3332600593566895, "region": "Caribbean", "recipe": "Coconut Muffins", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90845"}, {"umap_0": 10.785271644592285, "umap_1": 8.983062744140625, "region": "UK", "recipe": "Fizzy Jell-O", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101636"}, {"umap_0": -0.33945655822753906, "umap_1": 10.537605285644531, "region": 
"Greek", "recipe": "Mediterranean Veggie Burgers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/109473"}, {"umap_0": -0.36531534790992737, "umap_1": 8.661611557006836, "region": "Australian", "recipe": "Pumpkin, Sweet Potato, and Leek Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/5595"}, {"umap_0": -0.5688000321388245, "umap_1": 11.917841911315918, "region": "Indian Subcontinent", "recipe": "Minty Cucumber Raita", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4603"}, {"umap_0": 9.867973327636719, "umap_1": 6.5029191970825195, "region": "Irish", "recipe": "Parkin- The Gingerbread from England", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135062"}, {"umap_0": 0.6897287964820862, "umap_1": 5.2665696144104, "region": "UK", "recipe": "Ham and Cheese Breakfast Quiche", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/8974"}, {"umap_0": -1.5464154481887817, "umap_1": 8.353484153747559, "region": "Deutschland", "recipe": "Truite a La Creme Zugoise (Trout in Herbed Cream Sauce)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118184"}, {"umap_0": -1.9634698629379272, "umap_1": 6.324359893798828, "region": "Greek", "recipe": "Baked Greek Chicken Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/107733"}, {"umap_0": -0.2915879487991333, "umap_1": 6.246881008148193, "region": "French", "recipe": "Alsatian Tart", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114885"}, {"umap_0": 10.76455307006836, "umap_1": 10.159624099731445, "region": "Australian", "recipe": "Cougar Kickoff Cocktail", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73552"}, {"umap_0": -0.6325548887252808, "umap_1": 8.991036415100098, "region": "Scandinavian", "recipe": "Pickled Herring Nana's Recipe (Or so Close)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/106272"}, {"umap_0": 10.551515579223633, "umap_1": 
10.310701370239258, "region": "Australian", "recipe": "An Aussie Sol", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77241"}, {"umap_0": 0.6943299770355225, "umap_1": 9.899022102355957, "region": "US", "recipe": "Presidential Debate Chili", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16498"}, {"umap_0": -0.9291115999221802, "umap_1": 9.588019371032715, "region": "Indian Subcontinent", "recipe": "Bengali-Style Green Beans", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67246"}, {"umap_0": 1.3908172845840454, "umap_1": 7.886014938354492, "region": "Deutschland", "recipe": "Badenian Sour Liver (Badisches Saures Leberle)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137908"}, {"umap_0": 3.075787305831909, "umap_1": 9.263035774230957, "region": "South American", "recipe": "Hawaiian Teriyaki Beef", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98417"}, {"umap_0": -2.2943882942199707, "umap_1": 6.853687286376953, "region": "Middle Eastern", "recipe": "Persian Herb and Cheese", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/2848"}, {"umap_0": -1.0580228567123413, "umap_1": 4.607542037963867, "region": "Italian", "recipe": "Low Carb Zucchini Lasagna (South Beach Friendly)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/128064"}, {"umap_0": 9.170491218566895, "umap_1": 5.9093403816223145, "region": "South American", "recipe": "Chilean Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93789"}, {"umap_0": 3.9008381366729736, "umap_1": 10.0866060256958, "region": "Chinese and Mongolian", "recipe": "Betty Crocker Oriental Dressing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55227"}, {"umap_0": 11.219620704650879, "umap_1": 8.545230865478516, "region": "French", "recipe": "Strawberry Mille Feuille", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/116827"}, {"umap_0": 10.95186710357666, "umap_1": 
7.6603803634643555, "region": "Scandinavian", "recipe": "Danish Almond Ring", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100916"}, {"umap_0": 1.7908753156661987, "umap_1": 7.278170108795166, "region": "Scandinavian", "recipe": "Time-Life Swedish Meatballs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/106291"}, {"umap_0": -0.4508264362812042, "umap_1": 11.151329040527344, "region": "Italian", "recipe": "Zucchini Lentil Pasta Sauce (Vegetarian)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123242"}, {"umap_0": 11.427289962768555, "umap_1": 8.460942268371582, "region": "Australian", "recipe": "Pound Cake Truffles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73847"}, {"umap_0": 10.072717666625977, "umap_1": 5.631075859069824, "region": "Canadian", "recipe": "Cinnamon Twist Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145556"}, {"umap_0": 1.6756099462509155, "umap_1": 6.191090106964111, "region": "Australian", "recipe": "Nachos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77134"}, {"umap_0": -0.1459924429655075, "umap_1": 9.162400245666504, "region": "Middle Eastern", "recipe": "Koshary", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/2619"}, {"umap_0": -1.4194878339767456, "umap_1": 10.445621490478516, "region": "Canadian", "recipe": "Lemon-Rosemary Grilled Pork Tenderloin", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147583"}, {"umap_0": -1.2383641004562378, "umap_1": 7.414096355438232, "region": "Italian", "recipe": "Vegan Lasagna I", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12617"}, {"umap_0": 0.786368727684021, "umap_1": 7.645626068115234, "region": "Italian", "recipe": "Italian Ranch Shredded Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/133178"}, {"umap_0": 10.711524963378906, "umap_1": 10.80091381072998, "region": "Canadian", "recipe": "Tequila 
Grapefruit Splash", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146676"}, {"umap_0": 0.46166691184043884, "umap_1": 8.206416130065918, "region": "Deutschland", "recipe": "Kraut and Sausage in a Crockpot", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138219"}, {"umap_0": 1.073598861694336, "umap_1": 11.379616737365723, "region": "Mexican", "recipe": "Manchamanteles (Fruity Mole Stew)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/86979"}, {"umap_0": 0.06744198501110077, "umap_1": 10.287727355957031, "region": "Rest Africa", "recipe": "Curried Okra", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47995"}, {"umap_0": -1.7383971214294434, "umap_1": 10.345659255981445, "region": "Rest Africa", "recipe": "Marinated Baby Marrow (Courgette-Zucchini)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/51079"}, {"umap_0": -1.505161166191101, "umap_1": 5.65922737121582, "region": "Italian", "recipe": "Polenta With Italian Sausage and Vegetables", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125166"}, {"umap_0": 1.3676795959472656, "umap_1": 8.173250198364258, "region": "Central American", "recipe": "All American Hamburgers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93098"}, {"umap_0": -1.4643367528915405, "umap_1": 9.460729598999023, "region": "Australian", "recipe": "Pasta With Tuna Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/5546"}, {"umap_0": 11.055620193481445, "umap_1": 6.838775634765625, "region": "French", "recipe": "Chocolate-Dipped Hazelnut Marbles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115533"}, {"umap_0": -1.540805459022522, "umap_1": 8.47479248046875, "region": "Spanish and Portuguese", "recipe": "Broccoli, Cauliflower and Romanesco Cauliflower With Warm Ham Vi", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140740"}, {"umap_0": 0.7387064695358276, "umap_1": 
7.098805904388428, "region": "South American", "recipe": "Individual Beef Wellingtons II", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96313"}, {"umap_0": 1.9130972623825073, "umap_1": 5.743204593658447, "region": "Mexican", "recipe": "Salisbury Steak - Tex Mex Way", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85636"}, {"umap_0": 11.582109451293945, "umap_1": 7.7360405921936035, "region": "French", "recipe": "French Waffles with Strawberries", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113980"}, {"umap_0": 11.66311264038086, "umap_1": 9.274120330810547, "region": "Rest Africa", "recipe": "South African Smoothie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47795"}, {"umap_0": -0.5895673632621765, "umap_1": 10.910447120666504, "region": "Mexican", "recipe": "Fish and Veggie Tacos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84844"}, {"umap_0": 9.877508163452148, "umap_1": 9.326733589172363, "region": "Canadian", "recipe": "Apple-Apricot Sauce for Pork", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148556"}, {"umap_0": 1.8734489679336548, "umap_1": 5.978754997253418, "region": "Mexican", "recipe": "Hearty Chicken Enchilada Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/88588"}, {"umap_0": -2.0421183109283447, "umap_1": 9.039802551269531, "region": "South American", "recipe": "Beef Adobo Salpicao", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96907"}, {"umap_0": 2.3681581020355225, "umap_1": 10.772686004638672, "region": "Southeast Asian", "recipe": "Vegetarian Pho (Vietnamese Noodle Soup)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3772"}, {"umap_0": -1.0620571374893188, "umap_1": 6.112938404083252, "region": "Italian", "recipe": "Zesty Fried Meatballs With Red Pepper Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132783"}, {"umap_0": 9.747696876525879, 
"umap_1": 7.595332622528076, "region": "Australian", "recipe": "Lemon or Lime and Macadamia Biscotti", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76870"}, {"umap_0": 10.19006061553955, "umap_1": 8.046283721923828, "region": "UK", "recipe": "Thai Baked Custard", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101321"}, {"umap_0": 9.616816520690918, "umap_1": 7.417548656463623, "region": "Scandinavian", "recipe": "Danish Pancakes W- Tart Strawberries", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101145"}, {"umap_0": -1.9307607412338257, "umap_1": 7.019047260284424, "region": "Italian", "recipe": "Italian Veggie Skillet", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120436"}, {"umap_0": 0.5598496198654175, "umap_1": 10.123669624328613, "region": "Mexican", "recipe": "Chicken Chili", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/88476"}, {"umap_0": 9.663139343261719, "umap_1": 7.648581504821777, "region": "UK", "recipe": "Scottish Tablet", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101615"}, {"umap_0": 2.9634034633636475, "umap_1": 9.608389854431152, "region": "Southeast Asian", "recipe": "Rumaki", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3929"}, {"umap_0": 0.2709687650203705, "umap_1": 12.72960376739502, "region": "Indian Subcontinent", "recipe": "Eggplant Bharta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67573"}, {"umap_0": 10.224851608276367, "umap_1": 9.328141212463379, "region": "Canadian", "recipe": "Nat's Peach Topping", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146305"}, {"umap_0": -1.7756894826889038, "umap_1": 9.381627082824707, "region": "Australian", "recipe": "5-Minute Low-Fat Garlic Prawns", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75100"}, {"umap_0": 1.8640124797821045, "umap_1": 4.824682235717773, "region": "Mexican", "recipe": "Texas Tortilla Soup", 
"url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79996"}, {"umap_0": -2.374586343765259, "umap_1": 9.79255485534668, "region": "French", "recipe": "Classic  Ravigote  Dressing for Tossed Salads", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113239"}, {"umap_0": 10.879724502563477, "umap_1": 8.046968460083008, "region": "French", "recipe": "Amazing French Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113011"}, {"umap_0": -1.6272448301315308, "umap_1": 7.297032356262207, "region": "Spanish and Portuguese", "recipe": "Spanish Omelet With Potatoes and Chorizo", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141445"}, {"umap_0": 7.513177871704102, "umap_1": 6.72554874420166, "region": "Italian", "recipe": "Basic Tortellini Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122105"}, {"umap_0": 10.540383338928223, "umap_1": 7.636196136474609, "region": "UK", "recipe": "Perfect Lemon Curd", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/8778"}, {"umap_0": 0.3487512469291687, "umap_1": 10.458357810974121, "region": "Mexican", "recipe": "Mark's Low Fat Chicken Fajitas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/89434"}, {"umap_0": 11.746549606323242, "umap_1": 7.831028461456299, "region": "French", "recipe": "French Silk Fudge", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115854"}, {"umap_0": -0.45325809717178345, "umap_1": 8.007261276245117, "region": "Deutschland", "recipe": "Sp\u00e4tzle Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138781"}, {"umap_0": -1.7935446500778198, "umap_1": 10.381575584411621, "region": "Spanish and Portuguese", "recipe": "Shrimps Fried With Garlic", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141178"}, {"umap_0": 11.02805233001709, "umap_1": 7.780513286590576, "region": "US", "recipe": "Quick Sticky Buns", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/14052"}, {"umap_0": -2.161686897277832, "umap_1": 10.18532943725586, "region": "Spanish and Portuguese", "recipe": "Cauliflower Al Ajoarriero - Spanish", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141897"}, {"umap_0": -0.6048974394798279, "umap_1": 10.580102920532227, "region": "Mexican", "recipe": "Vince's Chicken Tortilla Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/88234"}, {"umap_0": 0.6900504231452942, "umap_1": 9.686701774597168, "region": "Mexican", "recipe": "Easy &amp; Tasty Chicken Tortilla Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80075"}, {"umap_0": 0.21998491883277893, "umap_1": 8.772449493408203, "region": "Spanish and Portuguese", "recipe": "Spanish Turkey Meatballs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140886"}, {"umap_0": -2.073589324951172, "umap_1": 8.879148483276367, "region": "Italian", "recipe": "Anchovy-Olive Pasta Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125951"}, {"umap_0": 9.847919464111328, "umap_1": 6.562987327575684, "region": "US", "recipe": "Apple Stack Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16304"}, {"umap_0": 9.165212631225586, "umap_1": 7.350872039794922, "region": "UK", "recipe": "Scottish Tablet II", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101877"}, {"umap_0": 10.443743705749512, "umap_1": 8.475082397460938, "region": "UK", "recipe": "Turkish Delight", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/8736"}, {"umap_0": 2.368593692779541, "umap_1": 9.768277168273926, "region": "Thai", "recipe": "Linda's Thai  Sweet Chili  Sauce for Dipping (Egg Rolls, Sushi)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58370"}, {"umap_0": 10.211921691894531, "umap_1": 6.171392440795898, "region": "Scandinavian", "recipe": "Danish Brovst Dream Coffee Cake", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101008"}, {"umap_0": -0.5999118685722351, "umap_1": 11.400858879089355, "region": "Caribbean", "recipe": "Jerk Marinade", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90512"}, {"umap_0": -2.055128574371338, "umap_1": 7.81612491607666, "region": "Italian", "recipe": "Brent's Favorite Clam Linguine", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123806"}, {"umap_0": 0.3431191146373749, "umap_1": 5.240170955657959, "region": "South American", "recipe": "Beef and Macaroni Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99207"}, {"umap_0": 0.5555034875869751, "umap_1": 9.143943786621094, "region": "Caribbean", "recipe": "Puerto Rican Macaroni Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91091"}, {"umap_0": 1.4609586000442505, "umap_1": 9.906042098999023, "region": "Deutschland", "recipe": "Vegan Octoberfest Weiners(Brats)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138985"}, {"umap_0": -0.4465193450450897, "umap_1": 8.947714805603027, "region": "US", "recipe": "Toddly Man's Big House Black-Eyed Peas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/14044"}, {"umap_0": 2.6515583992004395, "umap_1": 9.226421356201172, "region": "Chinese and Mongolian", "recipe": "Chow Yuk", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/57173"}, {"umap_0": 2.7805960178375244, "umap_1": 9.73937702178955, "region": "Thai", "recipe": "Peanut-Ginger Tofu", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/57570"}, {"umap_0": 1.560707926750183, "umap_1": 11.354384422302246, "region": "Japanese", "recipe": "Scallops with Wasabi and Ginger", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70728"}, {"umap_0": -1.5759351253509521, "umap_1": 11.272733688354492, "region": "Middle Eastern", "recipe": "Huh Huh Huh Huhmmos", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/60752"}, {"umap_0": -0.34053176641464233, "umap_1": 5.331582546234131, "region": "Italian", "recipe": "Ham and Fresh Basil Pinwheels", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12191"}, {"umap_0": 0.41087138652801514, "umap_1": 5.5249924659729, "region": "South American", "recipe": "Beef and Asparagus Rolls", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96293"}, {"umap_0": 1.48582923412323, "umap_1": 6.456732273101807, "region": "Mexican", "recipe": "Southwest Chicken Skillet", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/89464"}, {"umap_0": 0.139811173081398, "umap_1": 7.923862457275391, "region": "Spanish and Portuguese", "recipe": "Portuguese Chicken &amp; Sausage", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/119123"}, {"umap_0": -0.508108377456665, "umap_1": 7.330385684967041, "region": "UK", "recipe": "English Garden Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/103962"}, {"umap_0": 9.731258392333984, "umap_1": 8.184741973876953, "region": "Caribbean", "recipe": "Sugar Cake (Trinidad)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91003"}, {"umap_0": -1.5692862272262573, "umap_1": 8.892705917358398, "region": "Canadian", "recipe": "Blueberry Mustard", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/149052"}, {"umap_0": -0.2067275047302246, "umap_1": 6.128017902374268, "region": "Mexican", "recipe": "Ultimate Meal Frittata", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87422"}, {"umap_0": 1.6260857582092285, "umap_1": 7.823522567749023, "region": "UK", "recipe": "Easy Shepherd's Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104705"}, {"umap_0": 11.417586326599121, "umap_1": 8.337646484375, "region": "Australian", "recipe": "Nanaimo Bars", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77697"}, {"umap_0": 
10.800504684448242, "umap_1": 9.343948364257812, "region": "Australian", "recipe": "Strawberry-Papaya Shake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74250"}, {"umap_0": 10.956401824951172, "umap_1": 6.908598899841309, "region": "Scandinavian", "recipe": "Danish Cream Cheese Wreath", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101146"}, {"umap_0": 1.3242301940917969, "umap_1": 12.063570976257324, "region": "Indian Subcontinent", "recipe": "Peanut Tikki", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/68247"}, {"umap_0": -0.25661998987197876, "umap_1": 6.924753665924072, "region": "Italian", "recipe": "Italian Fried Steak With Roasted Pepper Pesto", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131500"}, {"umap_0": 1.110457181930542, "umap_1": 11.157255172729492, "region": "Indian Subcontinent", "recipe": "Littlemafia's Fish Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64941"}, {"umap_0": 11.470293998718262, "umap_1": 8.441516876220703, "region": "Mexican", "recipe": "Mexican Hot Fudge Sundae", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82498"}, {"umap_0": -1.3576542139053345, "umap_1": 11.660940170288086, "region": "Mexican", "recipe": "Sarah's Fresh Picante Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81688"}, {"umap_0": 10.915353775024414, "umap_1": 10.356585502624512, "region": "Australian", "recipe": "Lychee Champagne Cocktail", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73037"}, {"umap_0": 10.455739974975586, "umap_1": 8.930646896362305, "region": "UK", "recipe": "Low Fat Tropical Prawn and Carrot Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102553"}, {"umap_0": 11.398619651794434, "umap_1": 7.977812767028809, "region": "Australian", "recipe": "Simple Vanilla Yoghurt Cream", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76844"}, {"umap_0": 
0.4924359619617462, "umap_1": 5.446086883544922, "region": "South American", "recipe": "Quick Barbecue Beef Bake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99811"}, {"umap_0": 1.6475183963775635, "umap_1": 8.434165000915527, "region": "South American", "recipe": "Slow-Roasted Beef(ATK)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/94847"}, {"umap_0": -0.15738041698932648, "umap_1": 8.916122436523438, "region": "Italian", "recipe": "Savory Italian Roast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125337"}, {"umap_0": -1.2619378566741943, "umap_1": 11.13453483581543, "region": "Middle Eastern", "recipe": "Lebanese Spiced Mushrooms", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63120"}, {"umap_0": 2.0212368965148926, "umap_1": 5.689796447753906, "region": "Mexican", "recipe": "Easy Mexican Layered Nacho Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/89868"}, {"umap_0": 1.2904988527297974, "umap_1": 5.878085613250732, "region": "Mexican", "recipe": "Creamy Green Chili, Chicken, &amp; Corn Enchilada-Style Bake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/86697"}, {"umap_0": -0.8889918327331543, "umap_1": 5.327736854553223, "region": "Italian", "recipe": "Pasta House Pasta Con Broccoli (Actual Recipe)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120430"}, {"umap_0": 11.01464557647705, "umap_1": 6.211829662322998, "region": "UK", "recipe": "Traditional Buttermilk Cake (And Optional Frosting Recipe)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/103574"}, {"umap_0": -0.33717992901802063, "umap_1": 12.258980751037598, "region": "Indian Subcontinent", "recipe": "Indian Spiced Shrimp Wrap", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64408"}, {"umap_0": 1.343888521194458, "umap_1": 5.1833953857421875, "region": "Canadian", "recipe": "Brown Rice With Chiles for Two", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148172"}, {"umap_0": 1.3319039344787598, "umap_1": 8.557623863220215, "region": "Canadian", "recipe": "Layered Cabbage Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144599"}, {"umap_0": 2.9133336544036865, "umap_1": 9.388020515441895, "region": "Japanese", "recipe": "Asian \"Chicken\" Noodle Soup(Vegetarian)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70197"}, {"umap_0": -0.329262912273407, "umap_1": 10.784530639648438, "region": "Mexican", "recipe": "Mexican Beans-Frijoles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78614"}, {"umap_0": -0.19194583594799042, "umap_1": 7.180932998657227, "region": "French", "recipe": "Veal Forestier", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115404"}, {"umap_0": 9.63549518585205, "umap_1": 7.940674781799316, "region": "UK", "recipe": "Courting Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101532"}, {"umap_0": 10.337065696716309, "umap_1": 9.078323364257812, "region": "UK", "recipe": "Festive Sugar Plums - Old Fashioned Sweetmeats", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101530"}, {"umap_0": -0.7901566028594971, "umap_1": 10.480709075927734, "region": "Spanish and Portuguese", "recipe": "Cacoila (Portuguese Stewed Pork)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/119208"}, {"umap_0": 1.8778561353683472, "umap_1": 5.939470291137695, "region": "Mexican", "recipe": "Skillet Steak Fajitas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85304"}, {"umap_0": 2.9359843730926514, "umap_1": 9.977510452270508, "region": "Chinese and Mongolian", "recipe": "Chinese Country Ribs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56381"}, {"umap_0": 1.198319673538208, "umap_1": 7.573212623596191, "region": "South American", "recipe": "Grammom's Beef Stroganoff", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/94261"}, {"umap_0": 10.64462661743164, "umap_1": 9.02371883392334, "region": "French", "recipe": "Strawberry Jam  Essence De Provence", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112218"}, {"umap_0": -1.8625191450119019, "umap_1": 9.217657089233398, "region": "Italian", "recipe": "Italian Sausages With Cabbage and Onion", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125563"}, {"umap_0": 2.388711929321289, "umap_1": 10.993955612182617, "region": "Scandinavian", "recipe": "Creamy Horseradish Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/106001"}, {"umap_0": 0.7922235727310181, "umap_1": 13.021902084350586, "region": "Indian Subcontinent", "recipe": "Goan Beef Roast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/65178"}, {"umap_0": -2.024712085723877, "umap_1": 7.572224140167236, "region": "Greek", "recipe": "Greek Goddess Wrap-Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/107704"}, {"umap_0": 3.0079281330108643, "umap_1": 9.619030952453613, "region": "Chinese and Mongolian", "recipe": "Chinese Gu Lao Rou", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/52712"}, {"umap_0": -2.506399631500244, "umap_1": 7.299046039581299, "region": "Greek", "recipe": "Ouzo Shrimp", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/108886"}, {"umap_0": 10.597884178161621, "umap_1": 5.756188869476318, "region": "Canadian", "recipe": "Black and White Cheesecake Squares:  It's a Martha Good Thing!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144639"}, {"umap_0": -0.9539880752563477, "umap_1": 7.705226898193359, "region": "Italian", "recipe": "Jumbo Shrimp and Zucchini Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122971"}, {"umap_0": 10.44797420501709, "umap_1": 8.318530082702637, "region": "Italian", "recipe": "Risotto Al Cioccolato (Chocolate 
Risotto)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/121062"}, {"umap_0": 9.988883972167969, "umap_1": 9.451470375061035, "region": "Canadian", "recipe": "Orange Beets", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145411"}, {"umap_0": -1.718770980834961, "umap_1": 9.141714096069336, "region": "French", "recipe": "Restaurant Plaisir-Sante Bouillabaisse", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115152"}, {"umap_0": 10.750125885009766, "umap_1": 6.471128940582275, "region": "UK", "recipe": "Sponge Vannilla Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/103715"}, {"umap_0": 9.887571334838867, "umap_1": 6.101161003112793, "region": "Canadian", "recipe": "Country Cookies (Oatmeal-Blueberry)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148802"}, {"umap_0": -0.44785696268081665, "umap_1": 9.105504035949707, "region": "Scandinavian", "recipe": "Danish Rollmops", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101063"}, {"umap_0": 10.67367935180664, "umap_1": 10.827676773071289, "region": "Mexican", "recipe": "Mango Margaritas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81251"}, {"umap_0": 2.5616462230682373, "umap_1": 10.654887199401855, "region": "Thai", "recipe": "Tho Ruung Pad Thai", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59377"}, {"umap_0": 1.7911100387573242, "umap_1": 10.399070739746094, "region": "Australian", "recipe": "French Dressing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76360"}, {"umap_0": 2.9168143272399902, "umap_1": 9.998456954956055, "region": "Thai", "recipe": "Sweet Chili Chicken With Asian Vegetable Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58869"}, {"umap_0": 0.14831365644931793, "umap_1": 11.140542984008789, "region": "Middle Eastern", "recipe": "Spicy Chicken Breasts With Tahini-Yogurt Sauce", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61977"}, {"umap_0": -0.19852854311466217, "umap_1": 11.588016510009766, "region": "Northern Africa", "recipe": "Moroccan Chicken Thighs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/50532"}, {"umap_0": 0.33052781224250793, "umap_1": 9.658503532409668, "region": "French", "recipe": "Oven Fried Chicken and French Fries", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/117202"}, {"umap_0": -0.28073740005493164, "umap_1": 11.500073432922363, "region": "Northern Africa", "recipe": "Moroccan-Spiced Crab Cakes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/50157"}, {"umap_0": -1.7667112350463867, "umap_1": 9.53663444519043, "region": "Italian", "recipe": "Fried Tomato, Onion, and Mushroom Ragout", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/10709"}, {"umap_0": -1.4612549543380737, "umap_1": 11.424036026000977, "region": "Middle Eastern", "recipe": "Claudia Roden's Hummus Bi Tahina (Chickpea and Sesame Dip)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63247"}, {"umap_0": 1.5612090826034546, "umap_1": 7.46037483215332, "region": "Scandinavian", "recipe": "Viking Meatball on a Stick - Copycat", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/107215"}, {"umap_0": -1.7705425024032593, "umap_1": 7.683483600616455, "region": "Italian", "recipe": "Sausage, Zucchini, Spinach and Tomato Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/128289"}, {"umap_0": 10.590760231018066, "umap_1": 6.590651988983154, "region": "Rest Africa", "recipe": "South African Melktert or Milk Tart (Custard Pie)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48897"}, {"umap_0": 0.8370258212089539, "umap_1": 6.745672702789307, "region": "South American", "recipe": "Brazilian Corn and Shrimp Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93479"}, {"umap_0": 
10.38011360168457, "umap_1": 7.146824836730957, "region": "French", "recipe": "Bittersweet Chocolate Truffles Rolled in Spices", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112880"}, {"umap_0": 10.54295825958252, "umap_1": 9.112892150878906, "region": "Canadian", "recipe": "Snow Covered Apples", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/149151"}, {"umap_0": 11.463234901428223, "umap_1": 7.6804304122924805, "region": "Mexican", "recipe": "Sopapilla Cheesecake Dessert", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6263"}, {"umap_0": 0.7639448642730713, "umap_1": 12.973309516906738, "region": "Indian Subcontinent", "recipe": "Chaat Masala", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72309"}, {"umap_0": -0.8754989504814148, "umap_1": 7.376551628112793, "region": "Scandinavian", "recipe": "Elisabeth's Garlic Dressing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140058"}, {"umap_0": 10.680970191955566, "umap_1": 6.97568941116333, "region": "UK", "recipe": "Traditional Bakewell Tart", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/8976"}, {"umap_0": 0.49108174443244934, "umap_1": 5.727354049682617, "region": "Canadian", "recipe": "Yummy Baked Potato Recipe", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142690"}, {"umap_0": 0.5761415362358093, "umap_1": 8.276139259338379, "region": "South American", "recipe": "Tender Brandied Beef", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/95496"}, {"umap_0": 0.8719810843467712, "umap_1": 7.818314075469971, "region": "Canadian", "recipe": "Meatballs Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148217"}, {"umap_0": 10.302268028259277, "umap_1": 8.12131118774414, "region": "Japanese", "recipe": "Green Tea Mousse", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70456"}, {"umap_0": 0.2910577058792114, "umap_1": 7.4747114181518555, "region": 
"Deutschland", "recipe": "Midwestern German Meatloaf", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138691"}, {"umap_0": 1.6556390523910522, "umap_1": 6.171474456787109, "region": "Mexican", "recipe": "Healthy Black Bean Enchiladas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82771"}, {"umap_0": -1.6506705284118652, "umap_1": 9.32905387878418, "region": "Italian", "recipe": "Italian Pepper Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120466"}, {"umap_0": -1.331217885017395, "umap_1": 11.124466896057129, "region": "Spanish and Portuguese", "recipe": "Saffron Butter", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140316"}, {"umap_0": 0.39597684144973755, "umap_1": 10.759122848510742, "region": "Mexican", "recipe": "Crispy Slow Cooker Carnitas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82909"}, {"umap_0": 0.8575361371040344, "umap_1": 9.894878387451172, "region": "Indian Subcontinent", "recipe": "Tigua Indian Definitive Bowl of Red", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69042"}, {"umap_0": -2.281219959259033, "umap_1": 9.666858673095703, "region": "French", "recipe": "Classic French Potato Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/116490"}, {"umap_0": 11.849913597106934, "umap_1": 8.119468688964844, "region": "Canadian", "recipe": "Oreo Cookie Drops", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144981"}, {"umap_0": -0.9227136373519897, "umap_1": 7.5815348625183105, "region": "Caribbean", "recipe": "Crabmeat St. 
Barts", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91351"}, {"umap_0": 9.55331802368164, "umap_1": 9.761798858642578, "region": "Deutschland", "recipe": "Pickled Red Onions", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138042"}, {"umap_0": 0.2414015382528305, "umap_1": 10.88632583618164, "region": "Middle Eastern", "recipe": "Chicken with Bulgur", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61449"}, {"umap_0": 0.9229373335838318, "umap_1": 9.335219383239746, "region": "UK", "recipe": "Lorne Sausage", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101439"}, {"umap_0": -0.6003052592277527, "umap_1": 7.396515846252441, "region": "Caribbean", "recipe": "Ensalada De Camarones (Puerto Rican Shrimp Salad)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92180"}, {"umap_0": 2.4931068420410156, "umap_1": 8.570638656616211, "region": "Southeast Asian", "recipe": "KARI-KARE (Meat and Vegetable Stew in Peanut Sauce)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63602"}, {"umap_0": 2.6950314044952393, "umap_1": 9.49687671661377, "region": "Chinese and Mongolian", "recipe": "Pepper Steak", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53142"}, {"umap_0": 9.711673736572266, "umap_1": 7.068781852722168, "region": "Greek", "recipe": "Greek Honey Cookies (Melomakarona)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/107831"}, {"umap_0": 0.9879035353660583, "umap_1": 8.370538711547852, "region": "Australian", "recipe": "Steak Bolshio", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73170"}, {"umap_0": 2.9836626052856445, "umap_1": 8.801194190979004, "region": "Indian Subcontinent", "recipe": "Pigeon Pea Paste (Aka Arhar Dal or Toovar Dal)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67008"}, {"umap_0": -1.7543110847473145, "umap_1": 5.666971683502197, "region": "Italian", "recipe": "Italian 
Zucchini Skillet", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125406"}, {"umap_0": -0.6595043540000916, "umap_1": 11.750669479370117, "region": "Middle Eastern", "recipe": "Pistachio Quinoa Pilaf With Harissa", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61588"}, {"umap_0": 8.21139144897461, "umap_1": 6.102329254150391, "region": "Canadian", "recipe": "Pastry With No Trans Fat", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147422"}, {"umap_0": 1.2604190111160278, "umap_1": 12.577383041381836, "region": "Indian Subcontinent", "recipe": "Dal Maharani (The Queen of Indian Lentils!)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64178"}, {"umap_0": -0.9993788599967957, "umap_1": 8.276711463928223, "region": "Australian", "recipe": "Roast Potato Tortilla", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74008"}, {"umap_0": 9.758047103881836, "umap_1": 6.126584529876709, "region": "UK", "recipe": "Cornish Caramel Pudding", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102915"}, {"umap_0": 0.9219141006469727, "umap_1": 8.323060035705566, "region": "Caribbean", "recipe": "Cuban-Greek Chicken and Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92689"}, {"umap_0": 9.570618629455566, "umap_1": 9.572281837463379, "region": "Mexican", "recipe": "Margaritas to Die For", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6604"}, {"umap_0": -2.5713889598846436, "umap_1": 8.20828914642334, "region": "Greek", "recipe": "Spiced Cheese Spread -  (Tirokafteri)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110817"}, {"umap_0": -0.10510776191949844, "umap_1": 10.881714820861816, "region": "Canadian", "recipe": "Southern Style Greens", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148372"}, {"umap_0": 7.527230739593506, "umap_1": 6.731996059417725, "region": "Canadian", "recipe": "Sunny's Pizza Dough", 
"url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144229"}, {"umap_0": -0.736492395401001, "umap_1": 7.209438800811768, "region": "Italian", "recipe": "Risotto with Chicken and Asparagus", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/11636"}, {"umap_0": 1.2257952690124512, "umap_1": 5.957526206970215, "region": "Mexican", "recipe": "Bite-Size Taco Turnovers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84583"}, {"umap_0": 2.9348320960998535, "umap_1": 10.314735412597656, "region": "Thai", "recipe": "Thai Wonton Tacos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59127"}, {"umap_0": 0.8653531074523926, "umap_1": 12.906578063964844, "region": "Indian Subcontinent", "recipe": "Best Potatoes Ever!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4454"}, {"umap_0": -1.2849581241607666, "umap_1": 8.195091247558594, "region": "Mexican", "recipe": "Anne's Calabacitas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6228"}, {"umap_0": 1.0773259401321411, "umap_1": 12.934423446655273, "region": "Indian Subcontinent", "recipe": "Baingan Ka Bharta (Indian-Punjabi Style)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67074"}, {"umap_0": 10.528319358825684, "umap_1": 8.996499061584473, "region": "Rest Africa", "recipe": "Banana Soup!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47881"}, {"umap_0": -0.8595709800720215, "umap_1": 11.71829891204834, "region": "Mexican", "recipe": "Smoky, Spicy Tomatillo Salsa Verde Aka Green Hell!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85722"}, {"umap_0": 0.9189432263374329, "umap_1": 12.381942749023438, "region": "Indian Subcontinent", "recipe": "Veggie Delite Masala Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64750"}, {"umap_0": -1.2338111400604248, "umap_1": 5.3467559814453125, "region": "Greek", "recipe": "Spinach &amp; Feta Baked 'Omelette'", 
"url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110374"}, {"umap_0": 1.4835739135742188, "umap_1": 12.297212600708008, "region": "Indian Subcontinent", "recipe": "Pilaf-Style Basmati Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63801"}, {"umap_0": 11.293014526367188, "umap_1": 8.827263832092285, "region": "Canadian", "recipe": "Sauteed Peaches with Vanilla Ice Cream", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147888"}, {"umap_0": 0.063170425593853, "umap_1": 11.672297477722168, "region": "Northern Africa", "recipe": "Eggplant (Aubergine) Tagine - Low Fat", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/50284"}, {"umap_0": 0.11022394150495529, "umap_1": 7.659482955932617, "region": "Canadian", "recipe": "Venison Chops With Mushroom Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145179"}, {"umap_0": -0.1398801952600479, "umap_1": 10.03643798828125, "region": "Mexican", "recipe": "Chicken Fajitas With Grilled Peppers and Onions", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/89478"}, {"umap_0": 2.0507731437683105, "umap_1": 11.173535346984863, "region": "Indian Subcontinent", "recipe": "Green Fish", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72236"}, {"umap_0": 11.396681785583496, "umap_1": 10.151305198669434, "region": "Rest Africa", "recipe": "Slippery Nipple", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/51189"}, {"umap_0": -2.499584674835205, "umap_1": 7.4646430015563965, "region": "Greek", "recipe": "Crunchy Feta &amp; Tomato Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110859"}, {"umap_0": -1.273516058921814, "umap_1": 6.4045233726501465, "region": "Italian", "recipe": "Dom's Linguine With Mushrooms &amp; Peas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123886"}, {"umap_0": 9.12509822845459, "umap_1": 8.28995132446289, "region": "Thai", "recipe": "Quick 
Thai Tea For Two", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58553"}, {"umap_0": 3.5569608211517334, "umap_1": 9.750225067138672, "region": "Chinese and Mongolian", "recipe": "Spicy Chinese Stir Fry Beef", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55307"}, {"umap_0": -1.1379667520523071, "umap_1": 7.447230815887451, "region": "Italian", "recipe": "Italian Chicken, Potatoes, and Squash Bake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120246"}, {"umap_0": 3.3748135566711426, "umap_1": 10.229393005371094, "region": "Chinese and Mongolian", "recipe": "Spicy Eggplant and Green Bean Stir Fry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/52254"}, {"umap_0": 7.55553674697876, "umap_1": 6.746270656585693, "region": "Italian", "recipe": "THE Easiest Pizza Crust", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129779"}, {"umap_0": 10.856467247009277, "umap_1": 7.206081867218018, "region": "Deutschland", "recipe": "German Baked Cheesecake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138267"}, {"umap_0": -1.17917799949646, "umap_1": 11.33971881866455, "region": "Northern Africa", "recipe": "Swordfish With Chermoula", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/50217"}, {"umap_0": 10.25863265991211, "umap_1": 7.950075149536133, "region": "Mexican", "recipe": "Margarita Mix", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80427"}, {"umap_0": 0.7936912178993225, "umap_1": 5.736952304840088, "region": "Indian Subcontinent", "recipe": "Mint and Cottage Cheese Samosas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72110"}, {"umap_0": 0.012634641490876675, "umap_1": 8.352033615112305, "region": "Eastern European", "recipe": "Studzienina - Polish Head Cheese", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/133689"}, {"umap_0": -0.5981330275535583, "umap_1": 10.241686820983887, "region": 
"Mexican", "recipe": "Mexican Avocado and Hearts of Palm Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84875"}, {"umap_0": 9.99521541595459, "umap_1": 7.086462020874023, "region": "Caribbean", "recipe": "Jamaican Easter Bun", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92052"}, {"umap_0": 10.75131607055664, "umap_1": 10.151999473571777, "region": "Australian", "recipe": "Passion Fruit and Lychee Martini", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74195"}, {"umap_0": 0.18940070271492004, "umap_1": 5.097394943237305, "region": "Deutschland", "recipe": "Swiss and Ham Quiche", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/117435"}, {"umap_0": 11.024012565612793, "umap_1": 6.953031063079834, "region": "Canadian", "recipe": "Chocolate Butterscotch Ripple Squares", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145352"}, {"umap_0": 4.343148708343506, "umap_1": 9.620804786682129, "region": "Japanese", "recipe": "Godzilla Roll - Sushi", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71954"}, {"umap_0": 0.998953640460968, "umap_1": 11.832167625427246, "region": "Indian Subcontinent", "recipe": "Tandoori Chicken Burgers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4119"}, {"umap_0": 3.1372594833374023, "umap_1": 9.517512321472168, "region": "Chinese and Mongolian", "recipe": "Moo Goo Gai Pan", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55899"}, {"umap_0": 2.5337212085723877, "umap_1": 9.578829765319824, "region": "Chinese and Mongolian", "recipe": "Foil-Wrapped Ginger Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56223"}, {"umap_0": 1.7920453548431396, "umap_1": 4.7065324783325195, "region": "Mexican", "recipe": "Ole' Chicken Enchilada Lovers Casserole ( Ww )", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/88902"}, {"umap_0": -1.2037229537963867, "umap_1": 
4.891397953033447, "region": "Italian", "recipe": "Italian Parmesan Baked Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/133203"}, {"umap_0": 1.9315327405929565, "umap_1": 10.988516807556152, "region": "Southeast Asian", "recipe": "Dipping Sauce With Ginger (Nuoc Cham Gung)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62463"}, {"umap_0": 1.1667430400848389, "umap_1": 7.870398044586182, "region": "UK", "recipe": "All Day Beef Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102109"}, {"umap_0": 8.996176719665527, "umap_1": 7.8659186363220215, "region": "Indian Subcontinent", "recipe": "Besan Laddu", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4138"}, {"umap_0": -1.598083734512329, "umap_1": 6.080306053161621, "region": "Italian", "recipe": "Italian Savory Torte", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/133012"}, {"umap_0": -1.3958945274353027, "umap_1": 7.6432061195373535, "region": "Italian", "recipe": "Orzo With Lemon, Garlic, Parmigiano &amp; Herbs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120145"}, {"umap_0": 3.0398051738739014, "umap_1": 10.739788055419922, "region": "Thai", "recipe": "Thai Chili Sauce Salmon", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58266"}, {"umap_0": 1.2017189264297485, "umap_1": 6.279118061065674, "region": "South American", "recipe": "Cheesy Nacho's With Spicy Beef", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/97679"}, {"umap_0": 11.094701766967773, "umap_1": 8.38327407836914, "region": "Eastern European", "recipe": "Nesselrode Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100726"}, {"umap_0": -1.795520544052124, "umap_1": 6.056402683258057, "region": "Italian", "recipe": "Delicious and Simple Ricotta and Tomato Pasta Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/126934"}, {"umap_0": 10.636688232421875, 
"umap_1": 10.265210151672363, "region": "Australian", "recipe": "Mango Sorbet", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/5553"}, {"umap_0": 10.745750427246094, "umap_1": 7.198062419891357, "region": "French", "recipe": "Coconut French Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115081"}, {"umap_0": 8.843927383422852, "umap_1": 7.3433756828308105, "region": "Indian Subcontinent", "recipe": "How to Make 'tofu' - a Simple Homemade Recipe!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/68688"}, {"umap_0": 0.6295729279518127, "umap_1": 6.880488872528076, "region": "UK", "recipe": "Creamed SweetBreads", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102836"}, {"umap_0": -2.3599414825439453, "umap_1": 8.011959075927734, "region": "Italian", "recipe": "Rigatoni With Red Pepper, Almonds and Bread Crumbs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125060"}, {"umap_0": -0.7407087683677673, "umap_1": 11.649407386779785, "region": "Rest Africa", "recipe": "Kenya Fish Baked With Tomatoes and Spices (Mtuzi Wa Samaki)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48668"}, {"umap_0": 10.723774909973145, "umap_1": 8.147116661071777, "region": "Caribbean", "recipe": "Mango Coconut Bread Pudding With Rum Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91904"}, {"umap_0": 0.31455355882644653, "umap_1": 5.593790531158447, "region": "South American", "recipe": "Horseradish Beef Strudel", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98011"}, {"umap_0": -1.9179109334945679, "umap_1": 10.400378227233887, "region": "Middle Eastern", "recipe": "Dead Sea Babaganoush", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63220"}, {"umap_0": 3.466909885406494, "umap_1": 7.669651508331299, "region": "Mexican", "recipe": "Mexican Taquitos (Tuna)", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79884"}, {"umap_0": 9.628279685974121, "umap_1": 8.461490631103516, "region": "Northern Africa", "recipe": "Mhalbi", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/50972"}, {"umap_0": 0.8521121144294739, "umap_1": 12.55677318572998, "region": "Indian Subcontinent", "recipe": "Indian Spiced Sweet Potato Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/68067"}, {"umap_0": 10.216424942016602, "umap_1": 5.989542484283447, "region": "Australian", "recipe": "Banana Slice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74703"}, {"umap_0": 2.761542797088623, "umap_1": 10.280957221984863, "region": "Thai", "recipe": "Amanda's Thai Peanut", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58165"}, {"umap_0": 9.377599716186523, "umap_1": 7.761491298675537, "region": "Belgian", "recipe": "Appelflappen", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134008"}, {"umap_0": 11.440287590026855, "umap_1": 8.94237995147705, "region": "Irish", "recipe": "Cool Whip Jello", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136515"}, {"umap_0": 11.266894340515137, "umap_1": 7.430263042449951, "region": "French", "recipe": "Streusel Topped French Toast Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114049"}, {"umap_0": -0.9124829173088074, "umap_1": 4.1500468254089355, "region": "Italian", "recipe": "Kat's Slow-Cooker Lasagna", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125045"}, {"umap_0": -0.5463264584541321, "umap_1": 10.847198486328125, "region": "Caribbean", "recipe": "Grilled Chicken and Plantains, Jamaican-Style", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91854"}, {"umap_0": 2.5161972045898438, "umap_1": 9.62346363067627, "region": "Indian Subcontinent", "recipe": "Chicken Broccoli", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66718"}, {"umap_0": 0.3292330205440521, "umap_1": 10.66638469696045, "region": "Mexican", "recipe": "South Mexican Frijoles De Olla", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77808"}, {"umap_0": 0.8273971080780029, "umap_1": 7.926181316375732, "region": "Deutschland", "recipe": "Gypsy Schnitzel", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137117"}, {"umap_0": 0.7331476211547852, "umap_1": 4.986757278442383, "region": "Eastern European", "recipe": "Ham and Au Gratin Potatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/133873"}, {"umap_0": 9.699238777160645, "umap_1": 7.069442272186279, "region": "UK", "recipe": "Cranberry-Raisin Scones", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/103827"}, {"umap_0": 0.4987710416316986, "umap_1": 11.757622718811035, "region": "Australian", "recipe": "Kumera and Tuna Patties", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73908"}, {"umap_0": 0.42196187376976013, "umap_1": 10.087667465209961, "region": "Mexican", "recipe": "My Turkey Tortilla Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82631"}, {"umap_0": 11.472017288208008, "umap_1": 7.991462230682373, "region": "Australian", "recipe": "Chocolate and Ginger Tart", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73055"}, {"umap_0": -0.32733985781669617, "umap_1": 10.939979553222656, "region": "Mexican", "recipe": "Guacamole Cheese Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79129"}, {"umap_0": 10.394046783447266, "umap_1": 9.020578384399414, "region": "Canadian", "recipe": "Cranberry Orange and Honey Spread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143435"}, {"umap_0": -1.6260554790496826, "umap_1": 7.08851432800293, "region": "Greek", "recipe": "BBQ Chicken With Greek Salad", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/109161"}, {"umap_0": -0.000515283492859453, "umap_1": 12.56138801574707, "region": "Mexican", "recipe": "Fajita Steak Marinade", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84756"}, {"umap_0": -1.5862261056900024, "umap_1": 6.036340236663818, "region": "Italian", "recipe": "Italian Kale With Turkey Sausage and Penne", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130344"}, {"umap_0": 0.7508050203323364, "umap_1": 13.062902450561523, "region": "Indian Subcontinent", "recipe": "Chole Masala", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64892"}, {"umap_0": 2.606248617172241, "umap_1": 9.582758903503418, "region": "South American", "recipe": "Noodles With Honey-Balsamic Beef", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/94230"}, {"umap_0": 9.924560546875, "umap_1": 6.115597248077393, "region": "Rest Africa", "recipe": "Malva Pudding - Family's Favorite =p", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/51232"}, {"umap_0": 7.564727783203125, "umap_1": 6.725115776062012, "region": "South American", "recipe": "Pizza", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93201"}, {"umap_0": -0.8886260390281677, "umap_1": 4.601931095123291, "region": "Italian", "recipe": "Crock Pot Fondue Italiano", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132796"}, {"umap_0": -0.3406410217285156, "umap_1": 11.568140983581543, "region": "Middle Eastern", "recipe": "Fooll Mudammes (Fava Bean Egyptian Breakfast)(\u0641\u0648\u0644 \u0645\u062f\u0645\u0633)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47273"}, {"umap_0": -0.9585955739021301, "umap_1": 9.413331985473633, "region": "Northern Africa", "recipe": "Moroccan Red Gazpacho", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/50366"}, {"umap_0": -1.5916132926940918, "umap_1": 9.849132537841797, "region": "Greek", 
"recipe": "Alternative Greek Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/107630"}, {"umap_0": -0.9777724742889404, "umap_1": 8.05307388305664, "region": "Italian", "recipe": "Risotto With Chicken and Caramelized Onions", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122922"}, {"umap_0": -0.07009800523519516, "umap_1": 4.914054870605469, "region": "Scandinavian", "recipe": "Cream Cheese and Salmon Smorrebrod", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/139892"}, {"umap_0": 2.257594347000122, "umap_1": 8.318817138671875, "region": "Rest Africa", "recipe": "South African Chicken - Inyama Yenkukhu", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/51251"}, {"umap_0": -1.3626586198806763, "umap_1": 7.3130784034729, "region": "Italian", "recipe": "Paul's Italian-Style Meatloaf", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122340"}, {"umap_0": 2.269575834274292, "umap_1": 7.972594738006592, "region": "French", "recipe": "Baked French Fries", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/111590"}, {"umap_0": 3.0306313037872314, "umap_1": 10.520716667175293, "region": "Chinese and Mongolian", "recipe": "P F Chang's Sichuan-Style Asparagus", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55447"}, {"umap_0": 1.5169289112091064, "umap_1": 9.936500549316406, "region": "Northern Africa", "recipe": "Exotic and Sweet Moroccan Chicken with Spicy Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/50955"}, {"umap_0": -0.9425987005233765, "umap_1": 10.629751205444336, "region": "Mexican", "recipe": "A+ Sweet &amp; Savory Plantains (Or Banana) Appetizers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85588"}, {"umap_0": -2.378746509552002, "umap_1": 7.114105224609375, "region": "Greek", "recipe": "Orzo Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110909"}, {"umap_0": 
-1.2278001308441162, "umap_1": 7.524376392364502, "region": "Italian", "recipe": "Plum Tomatoes and Artichokes With Penne", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/128086"}, {"umap_0": 3.5802395343780518, "umap_1": 9.519756317138672, "region": "Chinese and Mongolian", "recipe": "Stir-Fried Prawns with Noodles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56583"}, {"umap_0": -0.5673277378082275, "umap_1": 9.488524436950684, "region": "Greek", "recipe": "Vegetable Orzo with Lemon &amp; Herb Seasoning", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/108743"}, {"umap_0": 11.21760368347168, "umap_1": 7.592402935028076, "region": "Middle Eastern", "recipe": "Date Brownies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3490"}, {"umap_0": -1.5351697206497192, "umap_1": 9.6422758102417, "region": "Middle Eastern", "recipe": "Tabbouleh", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3518"}, {"umap_0": 0.2601361870765686, "umap_1": 8.095123291015625, "region": "Canadian", "recipe": "The Best Pork Chop Dinner - EVER!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142781"}, {"umap_0": 1.6744741201400757, "umap_1": 5.47749137878418, "region": "Mexican", "recipe": "Zesty Dip for Chips", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/7861"}, {"umap_0": 0.798533022403717, "umap_1": 8.00855541229248, "region": "Australian", "recipe": "Jennifer's Shepherd's Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74429"}, {"umap_0": -1.5483787059783936, "umap_1": 10.348699569702148, "region": "Spanish and Portuguese", "recipe": "Pollo en Mojo (well sort of)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141287"}, {"umap_0": -1.224749207496643, "umap_1": 11.068684577941895, "region": "Mexican", "recipe": "Mexican Ceviche", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78509"}, {"umap_0": 
0.38385018706321716, "umap_1": 5.342156887054443, "region": "Canadian", "recipe": "Bacon and Egg Sandwich", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148197"}, {"umap_0": -1.0520628690719604, "umap_1": 10.406288146972656, "region": "Mexican", "recipe": "Guacamole Salad (Barefoot Contessa) Ina Garten", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82485"}, {"umap_0": 1.46034574508667, "umap_1": 9.320054054260254, "region": "Belgian", "recipe": "Pennsylvania Dutch Home Baked Beans", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134377"}, {"umap_0": 11.760978698730469, "umap_1": 7.804986953735352, "region": "Southeast Asian", "recipe": "Heavenly Chocomallows", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63587"}, {"umap_0": 3.2780463695526123, "umap_1": 9.204833030700684, "region": "Japanese", "recipe": "Shiitake Dashi", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70633"}, {"umap_0": -0.0033222923520952463, "umap_1": 10.11827278137207, "region": "US", "recipe": "Creamy Cilantro Salad Dressing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/18041"}, {"umap_0": 1.1248903274536133, "umap_1": 9.54050064086914, "region": "Indian Subcontinent", "recipe": "Crazy Spicy Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4024"}, {"umap_0": 3.1560261249542236, "umap_1": 8.235814094543457, "region": "Chinese and Mongolian", "recipe": "Chow Mein Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55472"}, {"umap_0": -1.2940279245376587, "umap_1": 5.540240287780762, "region": "French", "recipe": "Baked Escargot in Mushroom Caps", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115033"}, {"umap_0": 3.2141611576080322, "umap_1": 10.50078010559082, "region": "Japanese", "recipe": "Bang Bang Shrimp", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70857"}, {"umap_0": -0.5370005369186401, 
"umap_1": 7.8642449378967285, "region": "French", "recipe": "Provencal Roast Lamb Stuffed With Figs, Goat's Cheese and Walnut", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112877"}, {"umap_0": 0.5719629526138306, "umap_1": 4.872244834899902, "region": "Chinese and Mongolian", "recipe": "Broccoli Casserole and Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53432"}, {"umap_0": 10.1602783203125, "umap_1": 6.847714900970459, "region": "Scandinavian", "recipe": "Cognac\u00e6bleskiver - Danish - Cognac Apple Fritters", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100933"}, {"umap_0": 1.0897228717803955, "umap_1": 6.974140167236328, "region": "Canadian", "recipe": "Creamy Lemon Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145347"}, {"umap_0": 0.6935108304023743, "umap_1": 8.633729934692383, "region": "South American", "recipe": "Kickin Vegetable Beef Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98996"}, {"umap_0": 1.4681605100631714, "umap_1": 10.89324951171875, "region": "Middle Eastern", "recipe": "Saffron Couscous With Herbs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61577"}, {"umap_0": 9.679351806640625, "umap_1": 7.064289093017578, "region": "Belgian", "recipe": "Dutch Baby With Cranberry Orange Syrup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134415"}, {"umap_0": 0.4579998552799225, "umap_1": 5.721771240234375, "region": "Eastern European", "recipe": "Hungarian Noodle Side Dish", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/9061"}, {"umap_0": 0.6859077215194702, "umap_1": 4.895043849945068, "region": "South American", "recipe": "Rancher Beef Pita", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96889"}, {"umap_0": 9.46788215637207, "umap_1": 8.70071792602539, "region": "Mexican", "recipe": "How to Make Dulce de Leche", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6487"}, {"umap_0": 0.5781883597373962, "umap_1": 12.659963607788086, "region": "Indian Subcontinent", "recipe": "Cumin Scented Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/65315"}, {"umap_0": 11.278326034545898, "umap_1": 8.992044448852539, "region": "Irish", "recipe": "Irish Coffee from the Buena Vista San Francisco", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136130"}, {"umap_0": 1.748248815536499, "umap_1": 6.023683071136475, "region": "Mexican", "recipe": "Shredded Chicken Tacos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85050"}, {"umap_0": 11.584966659545898, "umap_1": 8.026214599609375, "region": "Italian", "recipe": "Gingerbread Latte", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130130"}, {"umap_0": 3.024116039276123, "umap_1": 9.615269660949707, "region": "Japanese", "recipe": "Miso Salmon Spaghetti", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71938"}, {"umap_0": 2.7208845615386963, "umap_1": 10.148514747619629, "region": "Southeast Asian", "recipe": "Chicken in Lemongrass (Ga Xao Xa Ot)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62435"}, {"umap_0": 2.4865505695343018, "umap_1": 8.89186954498291, "region": "Mexican", "recipe": "Easy Pototo Taquitos #5FIX", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80253"}, {"umap_0": 1.5574126243591309, "umap_1": 5.279350280761719, "region": "Mexican", "recipe": "Chicken Frito Enchiladas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/89212"}, {"umap_0": -1.0496069192886353, "umap_1": 6.808690547943115, "region": "Italian", "recipe": "Olive Garden Chicken Caprese", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127473"}, {"umap_0": 1.634734034538269, "umap_1": 5.269521236419678, "region": "Mexican", "recipe": "Quesadilla Pie", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81330"}, {"umap_0": -0.7381545305252075, "umap_1": 10.405523300170898, "region": "South American", "recipe": "Braised Beef and Onions", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/97146"}, {"umap_0": 11.534550666809082, "umap_1": 8.163215637207031, "region": "Deutschland", "recipe": "Upside-Down German Snack Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138452"}, {"umap_0": 0.3418410122394562, "umap_1": 7.589414119720459, "region": "Italian", "recipe": "Lazy Chicken Marsala", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125555"}, {"umap_0": 10.810623168945312, "umap_1": 9.988347053527832, "region": "Australian", "recipe": "Flying Kangaroo", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75647"}, {"umap_0": -0.06423194706439972, "umap_1": 8.653255462646484, "region": "Chinese and Mongolian", "recipe": "Slow-Cooked Harvest Vegetable and Rice Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/54143"}, {"umap_0": 2.9338090419769287, "umap_1": 9.77122688293457, "region": "Chinese and Mongolian", "recipe": "Crock Pot Oriental Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/52411"}, {"umap_0": 3.107837438583374, "umap_1": 9.255471229553223, "region": "Australian", "recipe": "Hot Potato Cakes With a Twist, and Fresh Tomato Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75079"}, {"umap_0": 0.19716471433639526, "umap_1": 11.994739532470703, "region": "Mexican", "recipe": "Hot Mexican Rub", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/86791"}, {"umap_0": 1.7486412525177002, "umap_1": 4.5669097900390625, "region": "South American", "recipe": "Cheesy Beef Noodles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98343"}, {"umap_0": -0.9900819063186646, "umap_1": 9.14884090423584, "region": "Caribbean", "recipe": "Low Fat Red Beans and 
Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90654"}, {"umap_0": -1.199900507926941, "umap_1": 6.398742198944092, "region": "Italian", "recipe": "Italian Sausage and Spinach Quiche", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127062"}, {"umap_0": 0.9132679104804993, "umap_1": 7.430976867675781, "region": "UK", "recipe": "English Sausage and Bean  Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104326"}, {"umap_0": -2.333700656890869, "umap_1": 6.373197078704834, "region": "Italian", "recipe": "Eggplant Pizza", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122171"}, {"umap_0": 1.5788544416427612, "umap_1": 7.845282554626465, "region": "Canadian", "recipe": "Tender and Tasty Roast Beef", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147232"}, {"umap_0": -0.7495318055152893, "umap_1": 5.995314598083496, "region": "Greek", "recipe": "Tomato-Feta-Greek Yogurt Phyllo Tart", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110319"}, {"umap_0": -0.5220938920974731, "umap_1": 5.3779377937316895, "region": "French", "recipe": "Brie and Red Currant Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112235"}, {"umap_0": -1.7691675424575806, "umap_1": 8.310100555419922, "region": "Australian", "recipe": "Pea, Feta and Mint Salad With Pistachios", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77192"}, {"umap_0": 2.1125428676605225, "umap_1": 6.313201427459717, "region": "Mexican", "recipe": "Tacos De Pollo", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85813"}, {"umap_0": 1.1592556238174438, "umap_1": 9.418966293334961, "region": "Indian Subcontinent", "recipe": "Steve's Chicken Korma", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4128"}, {"umap_0": -1.6128437519073486, "umap_1": 9.288741111755371, "region": "Greek", "recipe": "Baked Cod with Greek Potatoes", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/109566"}, {"umap_0": 3.8773720264434814, "umap_1": 9.406545639038086, "region": "Japanese", "recipe": "Japanese Scrambled Eggs with Pacific Saury", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/5071"}, {"umap_0": 3.343989372253418, "umap_1": 10.000374794006348, "region": "Chinese and Mongolian", "recipe": "Shun Lee's Szechuan Shrimp", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55363"}, {"umap_0": 1.3290810585021973, "umap_1": 6.70897912979126, "region": "Chinese and Mongolian", "recipe": "Chinese Curried Shrimp and Peas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/57082"}, {"umap_0": 0.2156497687101364, "umap_1": 12.363967895507812, "region": "Mexican", "recipe": "Super Fajita Marinade", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6160"}, {"umap_0": -0.010571273043751717, "umap_1": 7.808695316314697, "region": "Italian", "recipe": "Roasted Chicken With Lemons and Thyme", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125379"}, {"umap_0": 9.607107162475586, "umap_1": 7.821794509887695, "region": "Spanish and Portuguese", "recipe": "Aunt Fanny's Portuguese Rice Pudding", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118895"}, {"umap_0": 0.20657289028167725, "umap_1": 4.87691593170166, "region": "Irish", "recipe": "Shamrock Corned Beef Reuben Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136315"}, {"umap_0": -0.4607025980949402, "umap_1": 11.211666107177734, "region": "Mexican", "recipe": "Mexicali Pork Chops", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80759"}, {"umap_0": -1.1594617366790771, "umap_1": 4.619128227233887, "region": "Italian", "recipe": "Sausage and Spinach Tortellini Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/128055"}, {"umap_0": 1.0305882692337036, "umap_1": 11.929807662963867, "region": "Middle Eastern", 
"recipe": "Stuffed Beets", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72502"}, {"umap_0": 0.053588833659887314, "umap_1": 7.157907485961914, "region": "Irish", "recipe": "Irish Pub Soup (From 365 Easy One Dish Meals)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136491"}, {"umap_0": 10.691167831420898, "umap_1": 6.999594688415527, "region": "Australian", "recipe": "Oven Caramel Corn", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77246"}, {"umap_0": 0.8105179071426392, "umap_1": 12.083481788635254, "region": "Indian Subcontinent", "recipe": "Coconut Butter Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67996"}, {"umap_0": 11.624321937561035, "umap_1": 8.709522247314453, "region": "Italian", "recipe": "Tiramisu (An Italian Delight!)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120662"}, {"umap_0": -1.0343867540359497, "umap_1": 9.783365249633789, "region": "Italian", "recipe": "Savory Green Beans", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/10310"}, {"umap_0": -1.4338454008102417, "umap_1": 5.912444114685059, "region": "Italian", "recipe": "Tuscan Bean, Chicken, and Italian Sausage Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/10233"}, {"umap_0": -0.9197416305541992, "umap_1": 6.257259845733643, "region": "Italian", "recipe": "Kim's Italian Meatballs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/121882"}, {"umap_0": 8.84247875213623, "umap_1": 7.164631366729736, "region": "Eastern European", "recipe": "Hungarian Poppy Seed Roll (Beigli)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105710"}, {"umap_0": 1.770277738571167, "umap_1": 11.02878189086914, "region": "Southeast Asian", "recipe": "Vietnamese Lettuce Rolls With Spicy Grilled Tofu", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62082"}, {"umap_0": 10.831842422485352, "umap_1": 7.926901817321777, 
"region": "French", "recipe": "Walnut Raisin French Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113115"}, {"umap_0": 1.2650885581970215, "umap_1": 4.821526050567627, "region": "Mexican", "recipe": "Texas Taco Burgers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79787"}, {"umap_0": -0.12227872759103775, "umap_1": 10.868971824645996, "region": "South American", "recipe": "Peruvian Ceviche With Pickled Red Onions", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93916"}, {"umap_0": 9.949121475219727, "umap_1": 6.6415534019470215, "region": "UK", "recipe": "Cranberry Pudding", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/8998"}, {"umap_0": -0.10735031217336655, "umap_1": 10.361869812011719, "region": "Mexican", "recipe": "Camarones Con Frijole Sopa", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85277"}, {"umap_0": 9.936429023742676, "umap_1": 9.555929183959961, "region": "Italian", "recipe": "Iced Espresso Marvo", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130186"}, {"umap_0": -2.101750373840332, "umap_1": 9.796384811401367, "region": "Mexican", "recipe": "Jicama Salad (Mexico)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79600"}, {"umap_0": 1.458834171295166, "umap_1": 9.37569522857666, "region": "South American", "recipe": "Best Ever Beef Brisket", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98374"}, {"umap_0": -1.0927454233169556, "umap_1": 5.387875556945801, "region": "Italian", "recipe": "Parmesan Tomatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/10372"}, {"umap_0": -1.071266770362854, "umap_1": 9.169951438903809, "region": "Italian", "recipe": "Garlic Polenta With Sausage", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/126903"}, {"umap_0": 10.816166877746582, "umap_1": 8.989754676818848, "region": "French", "recipe": "Creme Glacee Aux Myrtilles (Frozen 
Blueberry Cream)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115702"}, {"umap_0": -2.5517635345458984, "umap_1": 7.902929782867432, "region": "Italian", "recipe": "Grilled Veg and Almond Ricotta Crostinis", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/126803"}, {"umap_0": -1.1118669509887695, "umap_1": 9.731581687927246, "region": "UK", "recipe": "Ginger Duck and Orange Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104493"}, {"umap_0": 1.6856554746627808, "umap_1": 11.480337142944336, "region": "Southeast Asian", "recipe": "Vietnamese Fish with Lemongrass and Chilli", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62256"}, {"umap_0": -2.4938180446624756, "umap_1": 8.836197853088379, "region": "French", "recipe": "Grape Picker's Sausages With Grapes, Thyme and Wine", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115258"}, {"umap_0": -1.139853596687317, "umap_1": 5.236851692199707, "region": "Italian", "recipe": "Italian Marinated Chicken Breasts", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123054"}, {"umap_0": -1.1536849737167358, "umap_1": 11.75439167022705, "region": "Northern Africa", "recipe": "Charmoula Mayonnaise", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49869"}, {"umap_0": -0.8700478076934814, "umap_1": 5.953568935394287, "region": "US", "recipe": "Tofu Creamed Spinach", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/15054"}, {"umap_0": -0.1698608547449112, "umap_1": 6.1382880210876465, "region": "Mexican", "recipe": "Mexican Corn Chili", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87201"}, {"umap_0": 10.885424613952637, "umap_1": 8.53498363494873, "region": "Irish", "recipe": "Irish Cream Almond Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136769"}, {"umap_0": 1.6548129320144653, "umap_1": 9.861000061035156, "region": "South American", "recipe": "Beef 
Tenderloin With Henry Bain Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/95333"}, {"umap_0": 2.0236690044403076, "umap_1": 7.660715103149414, "region": "Mexican", "recipe": "Cactus Chicken Fried Steak", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/89453"}, {"umap_0": 1.4477132558822632, "umap_1": 5.137422561645508, "region": "Central American", "recipe": "Party Roll-ups:  an American, an Englishman, and an Italian", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93024"}, {"umap_0": 1.9531077146530151, "umap_1": 5.021805286407471, "region": "South American", "recipe": "Salsa Beef Skillet", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96475"}, {"umap_0": 0.28648433089256287, "umap_1": 9.895027160644531, "region": "South American", "recipe": "Spicy Beef Goulash", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96638"}, {"umap_0": 0.9185366034507751, "umap_1": 8.543752670288086, "region": "Irish", "recipe": "Veggie Guinness Stew (Vegan)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135142"}, {"umap_0": 11.295226097106934, "umap_1": 7.38368558883667, "region": "French", "recipe": "Overnight Strawberry-Banana French Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112233"}, {"umap_0": -1.8786512613296509, "umap_1": 10.2505521774292, "region": "Australian", "recipe": "Healthy Grilled Herbed Fish", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75656"}, {"umap_0": -1.468809962272644, "umap_1": 9.261617660522461, "region": "Caribbean", "recipe": "Black Bean and Olive Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92548"}, {"umap_0": -1.3976020812988281, "umap_1": 11.138792991638184, "region": "Mexican", "recipe": "Rio Brava Salsa", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83716"}, {"umap_0": -0.7882421016693115, "umap_1": 8.978182792663574, "region": "Italian", "recipe": 
"Italian Pot Roast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125356"}, {"umap_0": 11.918304443359375, "umap_1": 10.068343162536621, "region": "Irish", "recipe": "Creme De Menthe Mousse", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135088"}, {"umap_0": 10.849958419799805, "umap_1": 9.31883716583252, "region": "US", "recipe": "Ambrosia Salad II", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17128"}, {"umap_0": -0.34674811363220215, "umap_1": 8.717296600341797, "region": "Irish", "recipe": "Irish Corned Beef With Cabbage", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135074"}, {"umap_0": -1.522047758102417, "umap_1": 5.529359817504883, "region": "Australian", "recipe": "Amazing Stuffed Capsicum", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73560"}, {"umap_0": 1.1962745189666748, "umap_1": 6.49261999130249, "region": "Japanese", "recipe": "Crab Crisps -  Kani Senbei", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70437"}, {"umap_0": 0.6921911835670471, "umap_1": 8.382219314575195, "region": "Canadian", "recipe": "Hamburger Macaroni Crock Pot Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145340"}, {"umap_0": 11.03760814666748, "umap_1": 8.769964218139648, "region": "UK", "recipe": "Nana's English Trifle", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104800"}, {"umap_0": 1.1344938278198242, "umap_1": 12.94425106048584, "region": "Indian Subcontinent", "recipe": "Kabuli Chana Masala Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63875"}, {"umap_0": 1.4630522727966309, "umap_1": 9.666353225708008, "region": "Canadian", "recipe": "Oven Spareribs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142867"}, {"umap_0": 9.789596557617188, "umap_1": 8.079959869384766, "region": "Scandinavian", "recipe": "Danish Krasekager, Marzipan Cakes", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100938"}, {"umap_0": -0.028178738430142403, "umap_1": 7.694163799285889, "region": "Italian", "recipe": "Chicken Piccata with Fettuccine", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/11454"}, {"umap_0": 11.998320579528809, "umap_1": 4.721621036529541, "region": "Eastern European", "recipe": "Rice Flour Crepes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/133465"}, {"umap_0": -1.1336603164672852, "umap_1": 7.587290287017822, "region": "US", "recipe": "Crab and Shrimp Louis", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17935"}, {"umap_0": 9.57773494720459, "umap_1": 6.368839740753174, "region": "Canadian", "recipe": "Whole Wheat Waffles With Sauteed Apples and Cranberries", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148719"}, {"umap_0": 1.9391995668411255, "umap_1": 4.642899036407471, "region": "Mexican", "recipe": "Quesadillas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78544"}, {"umap_0": -1.4901093244552612, "umap_1": 9.10964584350586, "region": "Greek", "recipe": "Greek Beef Kabobs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110249"}, {"umap_0": -0.9194230437278748, "umap_1": 4.757503032684326, "region": "Canadian", "recipe": "Canadian Bacon and Brie Quiche", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148708"}, {"umap_0": -0.8528251647949219, "umap_1": 8.821273803710938, "region": "Canadian", "recipe": "Creamy and Crunchy Tuna Salad Supreme", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/18785"}, {"umap_0": -1.6420037746429443, "umap_1": 11.2222900390625, "region": "Middle Eastern", "recipe": "Robin's Best Ever Hummus", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3649"}, {"umap_0": 1.3122022151947021, "umap_1": 8.742935180664062, "region": "Chinese and Mongolian", "recipe": "Mix Vegetable Pakoras (chinese style)", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56971"}, {"umap_0": -1.571243166923523, "umap_1": 5.437544822692871, "region": "Italian", "recipe": "Easy Pizza Pasta Casserole (OAMC)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131318"}, {"umap_0": 1.8441039323806763, "umap_1": 7.421917915344238, "region": "Greek", "recipe": "Chicken Avgolemono", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110091"}, {"umap_0": 2.304490327835083, "umap_1": 10.244935989379883, "region": "Thai", "recipe": "Grilled Thai Beef Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58451"}, {"umap_0": 10.359602928161621, "umap_1": 6.390215873718262, "region": "Mexican", "recipe": "Mexican Hot Chocolate Snickerdoodles (Vegan)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85053"}, {"umap_0": -0.16698792576789856, "umap_1": 9.321743965148926, "region": "Indian Subcontinent", "recipe": "West Indian Rum Stew Recipe", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64303"}, {"umap_0": -1.2580128908157349, "umap_1": 8.645570755004883, "region": "South American", "recipe": "Rosemary Steak", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/8512"}, {"umap_0": -0.11257972568273544, "umap_1": 5.336622714996338, "region": "Canadian", "recipe": "Cheezy Green Olive Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144906"}, {"umap_0": 9.351024627685547, "umap_1": 8.52212142944336, "region": "Central American", "recipe": "Honduras Torrejas ( Ladyfingers in Syrup)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92717"}, {"umap_0": -0.394025593996048, "umap_1": 8.105586051940918, "region": "Australian", "recipe": "Pineapple Cole Slaw With Apple &amp; Marshmallows", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74931"}, {"umap_0": 2.607651948928833, "umap_1": 7.844747543334961, "region": "Indian Subcontinent", "recipe": "Curry Crumbed 
Chicken Schnitzels", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/68944"}, {"umap_0": 10.120142936706543, "umap_1": 7.455615043640137, "region": "Middle Eastern", "recipe": "Raw Vegan Nutella", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70023"}, {"umap_0": 10.536667823791504, "umap_1": 5.85667610168457, "region": "Chinese and Mongolian", "recipe": "Chinese Almond Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53002"}, {"umap_0": -0.7199134826660156, "umap_1": 9.058918952941895, "region": "South American", "recipe": "Grilled Prime Ribs of Beef With Garlic and Rosemary", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/94903"}, {"umap_0": 0.8016179203987122, "umap_1": 6.586182594299316, "region": "Canadian", "recipe": "Hamburger Quickie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146506"}, {"umap_0": 11.00777816772461, "umap_1": 6.966134071350098, "region": "Australian", "recipe": "Fun Mini Muffins -Almond Meal- Flourless", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76407"}, {"umap_0": 0.40706926584243774, "umap_1": 7.468785762786865, "region": "Irish", "recipe": "Garlic Potatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135542"}, {"umap_0": 10.113068580627441, "umap_1": 7.425827980041504, "region": "Canadian", "recipe": "Acadian Cranberry Pie - Canada", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147068"}, {"umap_0": 10.305254936218262, "umap_1": 7.997673511505127, "region": "Greek", "recipe": "Baklava I", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/9433"}, {"umap_0": -2.232785224914551, "umap_1": 8.084171295166016, "region": "Italian", "recipe": "Italian Tuna Melts", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131745"}, {"umap_0": 1.244853138923645, "umap_1": 8.72779655456543, "region": "Canadian", "recipe": "Simple Sweet and Sour Chops", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146990"}, {"umap_0": -0.6168350577354431, "umap_1": 12.034870147705078, "region": "Caribbean", "recipe": "Jamaican Jerk Pork", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91483"}, {"umap_0": 1.1446646451950073, "umap_1": 12.14500904083252, "region": "Southeast Asian", "recipe": "Kari Ayam Kelapa (Curried chicken with toasted coconut)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59920"}, {"umap_0": -1.949479579925537, "umap_1": 10.577474594116211, "region": "Italian", "recipe": "Awesomely Easy Marinara Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124644"}, {"umap_0": 2.1451356410980225, "umap_1": 9.919189453125, "region": "Chinese and Mongolian", "recipe": "Chinese Muslim Stir-Fried Beef and Celery (Qin Cai Chao Niu Rou)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56457"}, {"umap_0": -2.591780185699463, "umap_1": 6.8310041427612305, "region": "Italian", "recipe": "Walnut Pesto  (Paula Deen)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/128283"}, {"umap_0": -0.7087075710296631, "umap_1": 6.0253400802612305, "region": "Italian", "recipe": "Vegan Gnocchi With Tomato Cream Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129289"}, {"umap_0": 0.43849554657936096, "umap_1": 10.27269172668457, "region": "Indian Subcontinent", "recipe": "Chana Ko Tarkari", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59647"}, {"umap_0": -0.42470529675483704, "umap_1": 7.816586017608643, "region": "Spanish and Portuguese", "recipe": "Portuguese Turkey Stuffing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118697"}, {"umap_0": -0.7156092524528503, "umap_1": 6.06956672668457, "region": "Greek", "recipe": "Three-Cheese Phyllo Triangles With Onions and Yogurt", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110192"}, {"umap_0": 0.2427350878715515, 
"umap_1": 5.417876720428467, "region": "UK", "recipe": "English Muffins With Bacon Butter", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104190"}, {"umap_0": -1.3072150945663452, "umap_1": 8.251768112182617, "region": "Canadian", "recipe": "Salmon Rillettes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147793"}, {"umap_0": -1.2602077722549438, "umap_1": 4.711234092712402, "region": "Italian", "recipe": "Stuffed Pasta Shells", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132738"}, {"umap_0": 10.495482444763184, "umap_1": 9.298541069030762, "region": "Southeast Asian", "recipe": "Cambodian Banana Dessert", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62712"}, {"umap_0": 2.0359106063842773, "umap_1": 6.208338260650635, "region": "Mexican", "recipe": "Fixed up Spanish Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77974"}, {"umap_0": -0.6023274660110474, "umap_1": 5.456448078155518, "region": "UK", "recipe": "Love It or Hate It - Marmite and Cheese Straws With a Twist!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101992"}, {"umap_0": -0.40834954380989075, "umap_1": 8.52096939086914, "region": "Deutschland", "recipe": "Braised Wild Boar Back", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137573"}, {"umap_0": 0.24316124618053436, "umap_1": 9.96104907989502, "region": "Canadian", "recipe": "Mike's Sweet Chili Surprise", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147116"}, {"umap_0": 10.19580078125, "umap_1": 7.699525356292725, "region": "Italian", "recipe": "Amaretti", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/10953"}, {"umap_0": 11.090867042541504, "umap_1": 6.297987461090088, "region": "Scandinavian", "recipe": "Almond Butter Balls - Danish Mandelsm\u00f8rboller", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100894"}, {"umap_0": -0.7224007248878479, "umap_1": 
6.452785968780518, "region": "Italian", "recipe": "Pasta Shrimp Toss", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125552"}, {"umap_0": 1.8311090469360352, "umap_1": 6.167548179626465, "region": "Mexican", "recipe": "Chicken Tortilla Soup With Shrimp", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84685"}, {"umap_0": 0.4322412610054016, "umap_1": 4.79233455657959, "region": "Canadian", "recipe": "Sweet &amp; Salty Grilled Cheese Sandwich", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144591"}, {"umap_0": -1.3548874855041504, "umap_1": 10.589512825012207, "region": "Northern Africa", "recipe": "Spiced Eggplant (Aubergine) Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/50512"}, {"umap_0": 3.4463462829589844, "umap_1": 9.780030250549316, "region": "Chinese and Mongolian", "recipe": "Tangerine Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55226"}, {"umap_0": -1.6261612176895142, "umap_1": 10.35120964050293, "region": "US", "recipe": "Grilled Montana Trout", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/13731"}, {"umap_0": -0.503948986530304, "umap_1": 5.466320037841797, "region": "Italian", "recipe": "Baked Fennel with Parmesan", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/10787"}, {"umap_0": -1.7699774503707886, "umap_1": 5.705374240875244, "region": "Italian", "recipe": "Pretty Party Pasta Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122246"}, {"umap_0": -1.963655948638916, "umap_1": 7.469742774963379, "region": "Italian", "recipe": "Roman-Style Spaghetti With Garlic and Olive Oil", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122720"}, {"umap_0": -1.174747109413147, "umap_1": 6.197281837463379, "region": "Italian", "recipe": "Chicken Asiago Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125369"}, {"umap_0": 0.08443750441074371, "umap_1": 
6.467732906341553, "region": "Eastern European", "recipe": "Hungarian Potato and Egg Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105795"}, {"umap_0": -0.8714216947555542, "umap_1": 5.548753261566162, "region": "Italian", "recipe": "Dandelion Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123763"}, {"umap_0": -1.8009405136108398, "umap_1": 8.431380271911621, "region": "Greek", "recipe": "Greek Caponata", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/109405"}, {"umap_0": 1.954458236694336, "umap_1": 8.671904563903809, "region": "Mexican", "recipe": "Pork Chops With Cumin Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/86419"}, {"umap_0": 1.7035311460494995, "umap_1": 11.042461395263672, "region": "Indian Subcontinent", "recipe": "Chicken Hariyali Tikka", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4051"}, {"umap_0": 10.80089282989502, "umap_1": 6.933441638946533, "region": "Eastern European", "recipe": "Jam Squares", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105549"}, {"umap_0": 11.060379028320312, "umap_1": 8.70517635345459, "region": "Italian", "recipe": "Strawberries &amp; Mascarpone Cream With Amaretti Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/121323"}, {"umap_0": -1.6337313652038574, "umap_1": 6.649634838104248, "region": "Italian", "recipe": "Pasta Fagioli", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127785"}, {"umap_0": 0.1645144820213318, "umap_1": 5.26685905456543, "region": "South American", "recipe": "Tater Tot, Beef &amp; Cheese Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98909"}, {"umap_0": 9.891934394836426, "umap_1": 7.50678825378418, "region": "Chinese and Mongolian", "recipe": "Milchreis (German Rice Pudding)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53801"}, {"umap_0": 11.150531768798828, "umap_1": 
6.2199931144714355, "region": "Eastern European", "recipe": "Libbies Celestial Crusts", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142502"}, {"umap_0": -1.3617044687271118, "umap_1": 8.202807426452637, "region": "US", "recipe": "Dee's Sexy, Spicy, Shrimp, Sausage, and Peppers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/15336"}, {"umap_0": -1.248594880104065, "umap_1": 9.778515815734863, "region": "Spanish and Portuguese", "recipe": "Spanish White Bean Tortilla With Piri-Piri Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141757"}, {"umap_0": 0.4271888732910156, "umap_1": 8.268725395202637, "region": "UK", "recipe": "Liver and Onions With Roast Veges", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104890"}, {"umap_0": 0.8311628699302673, "umap_1": 5.722709655761719, "region": "Mexican", "recipe": "Freezer Sausage  and Egg Breakfast Burritos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85153"}, {"umap_0": 2.4494593143463135, "umap_1": 10.893613815307617, "region": "Thai", "recipe": "Spicy Devil's Tom Yum Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3179"}, {"umap_0": 10.68728256225586, "umap_1": 7.74041748046875, "region": "Scandinavian", "recipe": "Finnish Turnips", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/13684"}, {"umap_0": -0.9320008158683777, "umap_1": 7.229828834533691, "region": "Italian", "recipe": "Pasta With Prosciutto in a Lemon Cream Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129941"}, {"umap_0": -0.26356375217437744, "umap_1": 10.286710739135742, "region": "Thai", "recipe": "Saus Prik Sticky Rice Sauce)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58872"}, {"umap_0": -0.046355120837688446, "umap_1": 12.459877967834473, "region": "Canadian", "recipe": "Spiced Cauliflower With a Kick", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144204"}, {"umap_0": 2.823256731033325, "umap_1": 10.540130615234375, "region": "Chinese and Mongolian", "recipe": "Butter Lime Cilantro Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/54186"}, {"umap_0": 0.8476681113243103, "umap_1": 5.892983436584473, "region": "Canadian", "recipe": "Ham and Cheese Puffs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146500"}, {"umap_0": 10.916544914245605, "umap_1": 9.311678886413574, "region": "Mexican", "recipe": "Joe's Famous Michelada", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6517"}, {"umap_0": 0.2030342072248459, "umap_1": 8.892056465148926, "region": "South American", "recipe": "Mom's Pepper Steak", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96318"}, {"umap_0": 0.6628851890563965, "umap_1": 8.37923812866211, "region": "South American", "recipe": "Bistro Beef Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98181"}, {"umap_0": -0.3064821660518646, "umap_1": 10.746016502380371, "region": "Middle Eastern", "recipe": "Middle Eastern Chicken Kebabs in Honey-Soy Marinade", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/60143"}, {"umap_0": -1.9981900453567505, "umap_1": 10.156318664550781, "region": "Middle Eastern", "recipe": "Classic Yogurt Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61945"}, {"umap_0": -1.644325852394104, "umap_1": 8.343690872192383, "region": "Rest Africa", "recipe": "Algerian Rice Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48679"}, {"umap_0": -1.5057601928710938, "umap_1": 8.401256561279297, "region": "French", "recipe": "French Bistro Steak and Tomatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113982"}, {"umap_0": 1.4272202253341675, "umap_1": 6.7758564949035645, "region": "US", "recipe": "Tennessee Eggs", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/18183"}, {"umap_0": -0.4038572609424591, "umap_1": 5.603295803070068, "region": "French", "recipe": "Cauliflower Flan", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/117227"}, {"umap_0": -2.774646043777466, "umap_1": 7.75654935836792, "region": "Greek", "recipe": "Balsamic &amp; Feta  Cheese Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/109706"}, {"umap_0": 7.632309436798096, "umap_1": 6.748541831970215, "region": "French", "recipe": "French Honey Bread Abm", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112949"}, {"umap_0": 0.5723637342453003, "umap_1": 10.387426376342773, "region": "Mexican", "recipe": "Mexican Crock-Pot Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/88713"}, {"umap_0": 0.5806989073753357, "umap_1": 8.598536491394043, "region": "South American", "recipe": "Apple Cider Beef Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96290"}, {"umap_0": 1.000257134437561, "umap_1": 8.229677200317383, "region": "Deutschland", "recipe": "All Pumpkins' Night Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138408"}, {"umap_0": 3.3223419189453125, "umap_1": 10.342912673950195, "region": "Chinese and Mongolian", "recipe": "Chinese Barbecued Chicken Wings", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/57196"}, {"umap_0": -0.6928055286407471, "umap_1": 5.213754177093506, "region": "Italian", "recipe": "Spinach Tomato Tortellini", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12494"}, {"umap_0": 9.983616828918457, "umap_1": 7.842149257659912, "region": "Chinese and Mongolian", "recipe": "Creamy Stove-Top Rice Pudding", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/54310"}, {"umap_0": -1.6628121137619019, "umap_1": 5.365706443786621, "region": "Italian", "recipe": "Al's Garlic Bread", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122073"}, {"umap_0": 1.8710174560546875, "umap_1": 5.851068019866943, "region": "Mexican", "recipe": "Real Mexican Enchiladas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82751"}, {"umap_0": -2.5632143020629883, "umap_1": 9.503864288330078, "region": "Greek", "recipe": "Lemon Oregano Marinade", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110189"}, {"umap_0": 11.128646850585938, "umap_1": 8.859893798828125, "region": "Australian", "recipe": "Peaches and Ice Cream, Tempted yet ?", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77557"}, {"umap_0": -1.4949678182601929, "umap_1": 5.864278316497803, "region": "Italian", "recipe": "Winnipeg's Spicy Pasta &amp; Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127522"}, {"umap_0": 1.350890874862671, "umap_1": 6.048244953155518, "region": "Mexican", "recipe": "Mexican Rice Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/89676"}, {"umap_0": 0.037330262362957, "umap_1": 9.225337982177734, "region": "Deutschland", "recipe": "German Red Cabbage", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138707"}, {"umap_0": 10.962918281555176, "umap_1": 7.244486331939697, "region": "French", "recipe": "Julia Child's Mousseline Au Chocolat (Chocolate Mousse)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115760"}, {"umap_0": -0.7767975330352783, "umap_1": 4.922647953033447, "region": "Italian", "recipe": "Simple Italian Grilled Cheese", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/119267"}, {"umap_0": -0.5491448640823364, "umap_1": 6.249672889709473, "region": "Greek", "recipe": "Bourekakia Tiri - Cheese Rolls", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110772"}, {"umap_0": 9.880678176879883, "umap_1": 7.899832248687744, "region": "Australian", "recipe": "Apple and Golden Syrup Pudding (Australia)", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74476"}, {"umap_0": -1.6649576425552368, "umap_1": 6.250842571258545, "region": "Italian", "recipe": "Manicotti", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125410"}, {"umap_0": 1.249767541885376, "umap_1": 8.422239303588867, "region": "UK", "recipe": "Scottish Meat Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102204"}, {"umap_0": 0.7794864773750305, "umap_1": 9.746917724609375, "region": "Mexican", "recipe": "Mexican Rice Bean Veggie Thing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78491"}, {"umap_0": 0.9988527297973633, "umap_1": 10.760143280029297, "region": "Thai", "recipe": "Curry Chicken With Couscous and Acorn Squash", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58081"}, {"umap_0": 9.439095497131348, "umap_1": 5.956291198730469, "region": "Mexican", "recipe": "Sopapillas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6369"}, {"umap_0": 0.055145878344774246, "umap_1": 9.339345932006836, "region": "Indian Subcontinent", "recipe": "Chilled cucumber soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64267"}, {"umap_0": 2.9634106159210205, "umap_1": 10.344860076904297, "region": "Canadian", "recipe": "Cumin Oysters", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148391"}, {"umap_0": 0.5340229868888855, "umap_1": 8.57981014251709, "region": "French", "recipe": "Marinated Beef Tenderloin-Caramelized Onions-Mushroom Ragout", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113940"}, {"umap_0": -1.1305686235427856, "umap_1": 5.919745445251465, "region": "Italian", "recipe": "Easy Baked Rigatoni With Ground Beef and Mushrooms", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130399"}, {"umap_0": -1.6870341300964355, "umap_1": 10.471819877624512, "region": "Greek", "recipe": "It's Greek to Me Chicken Breasts (Somersize Level 1)", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110284"}, {"umap_0": 1.446755051612854, "umap_1": 11.534807205200195, "region": "Scandinavian", "recipe": "Andreas Viestad's Pork Meatballs With Prunes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/107118"}, {"umap_0": 3.281654119491577, "umap_1": 7.6903157234191895, "region": "French", "recipe": "Mean Chef's French Fries", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/116588"}, {"umap_0": 11.510458946228027, "umap_1": 7.283414840698242, "region": "Southeast Asian", "recipe": "Cassava Cake With Shredded Buco", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63564"}, {"umap_0": -0.24797332286834717, "umap_1": 11.071017265319824, "region": "Mexican", "recipe": "My Charra Bean Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82319"}, {"umap_0": 10.181310653686523, "umap_1": 8.254674911499023, "region": "Middle Eastern", "recipe": "Cream Kunafa", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61825"}, {"umap_0": -0.42767879366874695, "umap_1": 9.853402137756348, "region": "Mexican", "recipe": "Best Ever Cilantro Corn Salsa", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/7619"}, {"umap_0": 9.857704162597656, "umap_1": 5.650196075439453, "region": "US", "recipe": "Applesauce Cornbread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16058"}, {"umap_0": 0.8255650401115417, "umap_1": 9.900074005126953, "region": "US", "recipe": "Home-Style Buffalo Wings", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17676"}, {"umap_0": 1.2974451780319214, "umap_1": 8.85756778717041, "region": "South American", "recipe": "Empanadas - Antioquia, Colombia", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93640"}, {"umap_0": 2.7266793251037598, "umap_1": 9.715110778808594, "region": "Chinese and Mongolian", "recipe": "Foo Yung", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56304"}, {"umap_0": 0.9828050136566162, "umap_1": 7.126698970794678, "region": "Canadian", "recipe": "Vichyssoise Cream of Leek", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144540"}, {"umap_0": 2.73215913772583, "umap_1": 8.722335815429688, "region": "Caribbean", "recipe": "Slow-Cooker Jamaican Jerk BBQ Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91911"}, {"umap_0": -0.465935081243515, "umap_1": 7.947761058807373, "region": "Italian", "recipe": "Vegan Mushroom &amp; Bechamel Lasagna", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129543"}, {"umap_0": 1.0105150938034058, "umap_1": 9.142107963562012, "region": "Indian Subcontinent", "recipe": "Stuffed Chatamari  (Nepali Stuffed Rice Crepes)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59667"}, {"umap_0": 0.16232894361019135, "umap_1": 8.142657279968262, "region": "US", "recipe": "Emma's Clam Chowder", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17688"}, {"umap_0": -1.5456469058990479, "umap_1": 8.884249687194824, "region": "Italian", "recipe": "Pasta and Peas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12626"}, {"umap_0": 1.2549721002578735, "umap_1": 12.993866920471191, "region": "Indian Subcontinent", "recipe": "Cucumber Cutlet", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67416"}, {"umap_0": 1.59001886844635, "umap_1": 9.221657752990723, "region": "Belgian", "recipe": "Dutch Meat Loaf", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134186"}, {"umap_0": 0.3997994661331177, "umap_1": 6.454665184020996, "region": "UK", "recipe": "Cheddar Cheese Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104940"}, {"umap_0": -1.2533832788467407, "umap_1": 10.593985557556152, "region": "Greek", "recipe": "Lamb Kebabs With Yoghurt Sauce", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/108899"}, {"umap_0": 2.177845001220703, "umap_1": 10.595561027526855, "region": "Spanish and Portuguese", "recipe": "Fruit Paella", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140494"}, {"umap_0": 0.8273208737373352, "umap_1": 6.018863201141357, "region": "Eastern European", "recipe": "Anita's Polish Pierogies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/133373"}, {"umap_0": 0.7407148480415344, "umap_1": 5.560225963592529, "region": "Mexican", "recipe": "Mexican Grilled Corn on the Cob", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/86996"}, {"umap_0": 3.3007805347442627, "umap_1": 9.967154502868652, "region": "Chinese and Mongolian", "recipe": "Home-Made Saucy Oriental Burger", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/57219"}, {"umap_0": 0.31244710087776184, "umap_1": 11.449378967285156, "region": "Rest Africa", "recipe": "Tunisian Eggah With Sausages", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47642"}, {"umap_0": 0.8983188271522522, "umap_1": 6.985197067260742, "region": "French", "recipe": "(Potatoes) Pommes Lyonnaise", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/116109"}, {"umap_0": -0.3177778422832489, "umap_1": 5.548505783081055, "region": "Mexican", "recipe": "Santa Fe Salmon With Fresh Spinach Wild Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85375"}, {"umap_0": 3.1425113677978516, "umap_1": 9.617280960083008, "region": "Korean", "recipe": "Spicy Tofu Casserole with Pork", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69120"}, {"umap_0": 1.70737624168396, "umap_1": 9.921626091003418, "region": "Canadian", "recipe": "Garlic Scape and Jalepeno Jelly", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142764"}, {"umap_0": -2.1506502628326416, "umap_1": 7.503776550292969, "region": "Italian", "recipe": "Olive Garden Pasta 
Salad Supremo", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125976"}, {"umap_0": 0.03127273544669151, "umap_1": 6.576362609863281, "region": "Mexican", "recipe": "Mexican Rice Skillet", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80970"}, {"umap_0": 3.0173182487487793, "umap_1": 9.560245513916016, "region": "Chinese and Mongolian", "recipe": "Pork and Vegetables in Black Bean Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/52215"}, {"umap_0": 0.7770051956176758, "umap_1": 11.703926086425781, "region": "UK", "recipe": "Bridget Jones's Turkey Buffet Curry in a Hurry!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101339"}, {"umap_0": -2.450244188308716, "umap_1": 7.627440452575684, "region": "Italian", "recipe": "Spaghetti W- Fresh Tomato Sauce W- Capers &amp; Fresh Mozzarella", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130813"}, {"umap_0": 0.2627440094947815, "umap_1": 5.478451251983643, "region": "Deutschland", "recipe": "German Zwiebelkuchen (Onion Pie)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/13305"}, {"umap_0": 11.089301109313965, "umap_1": 6.686619758605957, "region": "Eastern European", "recipe": "Hungarian Cheese-Filled Coffee Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105288"}, {"umap_0": -0.3958199620246887, "umap_1": 7.195949077606201, "region": "Irish", "recipe": "Dublin Bay Shrimp Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135374"}, {"umap_0": 0.08691467344760895, "umap_1": 4.500757694244385, "region": "Canadian", "recipe": "Cream Cheese Stuffed Mushroom caps", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148163"}, {"umap_0": 10.640205383300781, "umap_1": 7.689425468444824, "region": "Indian Subcontinent", "recipe": "Creme Caramels", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69003"}, {"umap_0": 0.7567131519317627, "umap_1": 
9.595846176147461, "region": "Mexican", "recipe": "Handsome Jack's Red Hot Red Wine Chili", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83351"}, {"umap_0": 10.276200294494629, "umap_1": 5.612985610961914, "region": "Irish", "recipe": "Baileys Irish Cream Pancakes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136096"}, {"umap_0": 2.251246213912964, "umap_1": 9.811243057250977, "region": "Chinese and Mongolian", "recipe": "Chinesey Chicken Wings", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56203"}, {"umap_0": 1.5477447509765625, "umap_1": 4.980820178985596, "region": "Mexican", "recipe": "Nachos Supreme", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83411"}, {"umap_0": -0.9134964346885681, "umap_1": 10.421812057495117, "region": "US", "recipe": "Texas Coleslaw", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17495"}, {"umap_0": -1.0912203788757324, "umap_1": 5.120869159698486, "region": "Italian", "recipe": "Eggplant (Aubergine) Parmigiano", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123409"}, {"umap_0": 3.584340810775757, "umap_1": 9.676220893859863, "region": "Korean", "recipe": "Kimchi Jun (Kimchi Patty)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4785"}, {"umap_0": -1.1599665880203247, "umap_1": 9.975829124450684, "region": "Northern Africa", "recipe": "Moroccan Oven Roasted Sesame Potatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49560"}, {"umap_0": -0.25800904631614685, "umap_1": 5.884183406829834, "region": "Canadian", "recipe": "Brussels Sprouts Gratin", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147828"}, {"umap_0": 8.855896949768066, "umap_1": 8.17015266418457, "region": "Australian", "recipe": "Carob Fudge", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73449"}, {"umap_0": 1.718493938446045, "umap_1": 7.4463887214660645, "region": "Scandinavian", "recipe": 
"Pressure Cooker Meatballs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/139688"}, {"umap_0": -0.054099734872579575, "umap_1": 8.034754753112793, "region": "US", "recipe": "Manhattan Filet with Pan Sauce Bordelaise", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17641"}, {"umap_0": 8.088932037353516, "umap_1": 6.6288042068481445, "region": "Scandinavian", "recipe": "Knaakkebrod or Nakkileipa (Scandinavian-Style Rye Crisp Bread)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/13432"}, {"umap_0": -0.5568294525146484, "umap_1": 6.193646430969238, "region": "Italian", "recipe": "Meat and Spinach Ravioli Filling", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/11937"}, {"umap_0": 7.752982139587402, "umap_1": 6.848856449127197, "region": "Italian", "recipe": "Easy Maple-Proofed Pizza Dough for One", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125669"}, {"umap_0": 3.9718432426452637, "umap_1": 9.385099411010742, "region": "Japanese", "recipe": "Chilled Soba Noodles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71199"}, {"umap_0": 3.336444139480591, "umap_1": 9.949758529663086, "region": "Chinese and Mongolian", "recipe": "Baked Pork Ribs With Hoisin Barbecue Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56402"}, {"umap_0": -2.5576043128967285, "umap_1": 8.035008430480957, "region": "Italian", "recipe": "Tomato and Mozzarella Bites", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/11445"}, {"umap_0": 8.521719932556152, "umap_1": 6.278942108154297, "region": "Eastern European", "recipe": "Buckwheat Pancakes Russian Blini", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100774"}, {"umap_0": 1.4134681224822998, "umap_1": 5.7431840896606445, "region": "Mexican", "recipe": "Baked Mexican Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80325"}, {"umap_0": 10.530927658081055, "umap_1": 
5.956060409545898, "region": "Belgian", "recipe": "The Best Almond Paste Bars", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/133978"}, {"umap_0": 10.893014907836914, "umap_1": 7.137414455413818, "region": "French", "recipe": "French Chocolate Cupcakes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113828"}, {"umap_0": 0.47363466024398804, "umap_1": 5.820273399353027, "region": "Irish", "recipe": "COLCANNON", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134767"}, {"umap_0": -1.0735938549041748, "umap_1": 6.248915672302246, "region": "Deutschland", "recipe": "Swiss Chard Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/117952"}, {"umap_0": 1.4559935331344604, "umap_1": 10.245786666870117, "region": "Thai", "recipe": "Coconut Curry Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58030"}, {"umap_0": -1.971341609954834, "umap_1": 8.966514587402344, "region": "Rest Africa", "recipe": "Turkish Chicken Kebabs With Aleppo Pepper", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48375"}, {"umap_0": 1.8515671491622925, "umap_1": 4.967075824737549, "region": "Mexican", "recipe": "Leftover Steak Enchiladas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80079"}, {"umap_0": 1.4935951232910156, "umap_1": 9.735342979431152, "region": "Chinese and Mongolian", "recipe": "Chinese Plum Chicken Cutlets", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55948"}, {"umap_0": 10.910282135009766, "umap_1": 6.873297691345215, "region": "Canadian", "recipe": "Peanut Brittle Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144614"}, {"umap_0": -1.534580945968628, "umap_1": 7.149397850036621, "region": "Italian", "recipe": "Balsamic Tofu or Chicken and Bright Veggie Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124278"}, {"umap_0": -0.6063388586044312, "umap_1": 8.344342231750488, "region": 
"Chinese and Mongolian", "recipe": "Steamed Fish With Cabbage, Mushrooms and Chinese Black Beans", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56230"}, {"umap_0": -0.9128854274749756, "umap_1": 9.804506301879883, "region": "Mexican", "recipe": "Black Bean and Corn Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77955"}, {"umap_0": 4.03208065032959, "umap_1": 10.294976234436035, "region": "Japanese", "recipe": "Oshitashi (Japanese Spinach Salad With Roasted Sesame)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70662"}, {"umap_0": 10.753658294677734, "umap_1": 7.548896789550781, "region": "Canadian", "recipe": "Napoleons", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146919"}, {"umap_0": 10.334403038024902, "umap_1": 10.143391609191895, "region": "Australian", "recipe": "Applemint Iced Tea", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72702"}, {"umap_0": 0.32953765988349915, "umap_1": 9.645954132080078, "region": "Central American", "recipe": "Dobladas(Guatemala)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93144"}, {"umap_0": 0.535389244556427, "umap_1": 13.254276275634766, "region": "Rest Africa", "recipe": "Spicy African Preserved Limes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49363"}, {"umap_0": -1.6463419198989868, "umap_1": 10.436721801757812, "region": "Irish", "recipe": "Terra's Colcannon (Courtesy of Spiral Path Farms Csa)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134671"}, {"umap_0": 10.580803871154785, "umap_1": 6.451938152313232, "region": "French", "recipe": "Fondant Aux Poires Et Au Gingembre (Choc Ginger Pear Fudge Cake)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112387"}, {"umap_0": 10.099893569946289, "umap_1": 6.042366027832031, "region": "Scandinavian", "recipe": "Brown Swedish Cookies", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105986"}, {"umap_0": -0.8276249170303345, "umap_1": 9.073336601257324, "region": "UK", "recipe": "Spicy Spaghetti With Garlic Mushrooms", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102796"}, {"umap_0": 1.7465989589691162, "umap_1": 10.382780075073242, "region": "Southeast Asian", "recipe": "Filipino Beef Kabobs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63719"}, {"umap_0": 1.919236660003662, "umap_1": 9.967972755432129, "region": "Southeast Asian", "recipe": "Caramelized Pork Banh Mi", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62129"}, {"umap_0": 11.071337699890137, "umap_1": 7.889893531799316, "region": "UK", "recipe": "Fresh Lime Pie (British Virgin Islands)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105061"}, {"umap_0": 0.3033906817436218, "umap_1": 7.623064041137695, "region": "Italian", "recipe": "My Best Chicken Piccata", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/11601"}, {"umap_0": 0.33403506875038147, "umap_1": 12.540398597717285, "region": "Indian Subcontinent", "recipe": "Indian Eggplant Curry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67461"}, {"umap_0": 0.24190753698349, "umap_1": 11.79411506652832, "region": "Rest Africa", "recipe": "Vegetarian Fig, Olive and Chickpea Tagine", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47690"}, {"umap_0": 0.33403292298316956, "umap_1": 9.166858673095703, "region": "Middle Eastern", "recipe": "Egyptian Bamia", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/2612"}, {"umap_0": 1.044874906539917, "umap_1": 6.875057220458984, "region": "Canadian", "recipe": "Something Like Mashed Potatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147326"}, {"umap_0": -1.742253303527832, "umap_1": 7.737472057342529, "region": "Italian", "recipe": "Penne Rigate With Prosciutto, and Snow Peas in 
a Truffled Cream", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123314"}, {"umap_0": 1.1833767890930176, "umap_1": 7.128292560577393, "region": "Irish", "recipe": "Creamed Cabbage and Carrots", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135907"}, {"umap_0": 0.009266161359846592, "umap_1": 8.048189163208008, "region": "South American", "recipe": "Danish Boiled Beef With Sweet and Sour Horseradish Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/95063"}, {"umap_0": 0.984284520149231, "umap_1": 12.958019256591797, "region": "Indian Subcontinent", "recipe": "Palak Panir (Saag Panir)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67577"}, {"umap_0": 11.241716384887695, "umap_1": 8.980339050292969, "region": "UK", "recipe": "Cranachan", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101641"}, {"umap_0": -0.7338122725486755, "umap_1": 6.916945457458496, "region": "Australian", "recipe": "Cheese, Onion, Leek &amp; Potato Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76429"}, {"umap_0": 0.7410939931869507, "umap_1": 9.992947578430176, "region": "Indian Subcontinent", "recipe": "Chicken Thighs with Lime and Curry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64429"}, {"umap_0": 2.165865182876587, "umap_1": 11.132843971252441, "region": "Southeast Asian", "recipe": "Fried Chilli Sambal", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59739"}, {"umap_0": 11.67033576965332, "umap_1": 9.134739875793457, "region": "Mexican", "recipe": "Pound Cake With Dulce De Leche Filling", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83950"}, {"umap_0": -1.00088369846344, "umap_1": 10.995707511901855, "region": "Middle Eastern", "recipe": "Lime Soup With Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61810"}, {"umap_0": -2.7427170276641846, "umap_1": 9.340370178222656, "region": "Middle 
Eastern", "recipe": "Jungle Juice (Cucumber and Tomato Salad)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63246"}, {"umap_0": 1.8458852767944336, "umap_1": 6.870546817779541, "region": "Italian", "recipe": "Italian Roast Beef", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124994"}, {"umap_0": -2.3552191257476807, "umap_1": 9.478193283081055, "region": "Indian Subcontinent", "recipe": "Indian Mayonnaise", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/68954"}, {"umap_0": 9.956003189086914, "umap_1": 7.380596160888672, "region": "French", "recipe": "Apple-Raspberry Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114179"}, {"umap_0": 0.6948102712631226, "umap_1": 9.551872253417969, "region": "Mexican", "recipe": "garlic shrimp", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81019"}, {"umap_0": -1.1323838233947754, "umap_1": 6.985578536987305, "region": "Italian", "recipe": "Quick Spaghetti", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123732"}, {"umap_0": -0.007444669492542744, "umap_1": 11.888574600219727, "region": "Indian Subcontinent", "recipe": "Pork Curry ( for the Ladies)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/65863"}, {"umap_0": 1.3530242443084717, "umap_1": 10.043692588806152, "region": "Mexican", "recipe": "Spicy Mexican Fish", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81094"}, {"umap_0": 10.996036529541016, "umap_1": 8.818633079528809, "region": "UK", "recipe": "Tipsy Jello Pudding Trifle", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101640"}, {"umap_0": 1.069703221321106, "umap_1": 8.941648483276367, "region": "US", "recipe": "Hopi Corn Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17602"}, {"umap_0": 8.756756782531738, "umap_1": 7.521526336669922, "region": "Deutschland", "recipe": "Omelette (Pancakes-Crepes)", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/117650"}, {"umap_0": 0.32673758268356323, "umap_1": 9.952874183654785, "region": "Mexican", "recipe": "Chipotle Chicken Taco Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/88862"}, {"umap_0": -0.43960681557655334, "umap_1": 12.580408096313477, "region": "Greek", "recipe": "Greek Blend Spice Mix", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/109732"}, {"umap_0": 9.864283561706543, "umap_1": 6.54950475692749, "region": "Canadian", "recipe": "Apple or Pear Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145981"}, {"umap_0": -1.075528860092163, "umap_1": 5.378075122833252, "region": "Italian", "recipe": "Easy Focaccia Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/11023"}, {"umap_0": -1.2162936925888062, "umap_1": 11.565125465393066, "region": "Mexican", "recipe": "Chunky Guacamole - Bobby Flay", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80217"}, {"umap_0": -1.6309967041015625, "umap_1": 12.115493774414062, "region": "Mexican", "recipe": "Elegant Guacamole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78670"}, {"umap_0": 11.894551277160645, "umap_1": 10.133395195007324, "region": "Italian", "recipe": "Frangelico Chocolate Martini", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124222"}, {"umap_0": 0.09613540023565292, "umap_1": 10.27182674407959, "region": "Caribbean", "recipe": "Fruity and Spicy Appetizers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90082"}, {"umap_0": 12.214362144470215, "umap_1": 7.9051032066345215, "region": "Chinese and Mongolian", "recipe": "Rice Crispy Treats", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/54423"}, {"umap_0": 2.190314769744873, "umap_1": 6.638415336608887, "region": "South American", "recipe": "Jimmy's Cachapas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/94049"}, 
{"umap_0": 9.523773193359375, "umap_1": 7.86791467666626, "region": "Australian", "recipe": "Kumara &amp; Corn Fritters", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76864"}, {"umap_0": 2.7804222106933594, "umap_1": 9.667583465576172, "region": "Chinese and Mongolian", "recipe": "Chicken Walnut Stir Fry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/52861"}, {"umap_0": 11.62756061553955, "umap_1": 8.036343574523926, "region": "French", "recipe": "Chocolate Pudding French Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115504"}, {"umap_0": -1.7413356304168701, "umap_1": 8.746635437011719, "region": "French", "recipe": "All-In-One Ni\u00e7oise Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113381"}, {"umap_0": 0.9104582071304321, "umap_1": 8.765548706054688, "region": "South American", "recipe": "Fresh Spinach With Ground Beef", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/97525"}, {"umap_0": 10.223584175109863, "umap_1": 9.447748184204102, "region": "Italian", "recipe": "Italian Haroset-Charoset for Passover", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/126729"}, {"umap_0": 3.0466790199279785, "umap_1": 8.960543632507324, "region": "Spanish and Portuguese", "recipe": "The Lazy Cook's Spanish Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141881"}, {"umap_0": -1.9119269847869873, "umap_1": 7.893063068389893, "region": "Greek", "recipe": "Couscous Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/108756"}, {"umap_0": -2.0197086334228516, "umap_1": 7.073923110961914, "region": "Italian", "recipe": "Carrabba's Cavatappi Amatriciana (Side Dish Pasta)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123626"}, {"umap_0": 1.7798981666564941, "umap_1": 8.77719783782959, "region": "Southeast Asian", "recipe": "Pickled Mustard Green (vietnamese Kimchi)", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62115"}, {"umap_0": 9.193370819091797, "umap_1": 6.022936820983887, "region": "Irish", "recipe": "Irish Soda Bread (Co.Cavan)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135670"}, {"umap_0": 11.39769458770752, "umap_1": 8.023594856262207, "region": "UK", "recipe": "English Muffins Topped With Bananas and Cinnamon Sugar.", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104176"}, {"umap_0": 1.29863440990448, "umap_1": 7.1696977615356445, "region": "Deutschland", "recipe": "Jaegerkohl (Hunter's Cabbage)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138382"}, {"umap_0": 0.5709402561187744, "umap_1": 12.530423164367676, "region": "Indian Subcontinent", "recipe": "Bombay Lamb Curry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67238"}, {"umap_0": 0.7810654044151306, "umap_1": 8.148824691772461, "region": "Eastern European", "recipe": "Krupnik (Polish Vegetable Barley Soup)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100124"}, {"umap_0": 10.818602561950684, "umap_1": 7.5809855461120605, "region": "US", "recipe": "Mock Cherry Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/14407"}, {"umap_0": -2.6787431240081787, "umap_1": 6.866921424865723, "region": "Italian", "recipe": "The Classic Pesto", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132900"}, {"umap_0": 1.6391514539718628, "umap_1": 8.039920806884766, "region": "Scandinavian", "recipe": "Curried Eggs on Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/139525"}, {"umap_0": 0.5552326440811157, "umap_1": 8.1549072265625, "region": "Canadian", "recipe": "Maple Pork and Peach Skewers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146581"}, {"umap_0": 1.9352266788482666, "umap_1": 5.161035060882568, "region": "Italian", "recipe": "Low Fat Italian Chili Rellenos Wrap", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130299"}, {"umap_0": -1.4051463603973389, "umap_1": 4.84638786315918, "region": "Italian", "recipe": "Italian Sausage Florentine for the Crock Pot", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/128673"}, {"umap_0": -2.1275382041931152, "umap_1": 6.559417724609375, "region": "Italian", "recipe": "Baked Vegetable Gratin (Verdure Al Forno)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120365"}, {"umap_0": 2.1892921924591064, "umap_1": 10.958094596862793, "region": "Southeast Asian", "recipe": "Rice Noodle Salad With Vietnamese Shrimp", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62374"}, {"umap_0": 11.588420867919922, "umap_1": 6.772088527679443, "region": "French", "recipe": "Speedy French Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113766"}, {"umap_0": -0.5309041142463684, "umap_1": 8.716485977172852, "region": "South American", "recipe": "Pressure Cooker Beef Barley Vegetable Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/97849"}, {"umap_0": 9.52021598815918, "umap_1": 6.072023391723633, "region": "Indian Subcontinent", "recipe": "Royal Cardamom Treats", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/68749"}, {"umap_0": -2.3790202140808105, "umap_1": 9.095721244812012, "region": "Italian", "recipe": "Green Olivida Crostini", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132323"}, {"umap_0": 11.441635131835938, "umap_1": 6.714810848236084, "region": "Deutschland", "recipe": "Coconut Pecan Frosting for German Chocolate Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137020"}, {"umap_0": 11.427209854125977, "umap_1": 8.777311325073242, "region": "Irish", "recipe": "Oreo Irish Liqueur Cheesecake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136649"}, {"umap_0": 1.168131709098816, "umap_1": 5.304154396057129, "region": 
"Mexican", "recipe": "Healthier Chicken Enchiladas II", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6898"}, {"umap_0": -0.629272997379303, "umap_1": 10.622817993164062, "region": "Rest Africa", "recipe": "Veggie-Loaded Side Dish Bake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/51461"}, {"umap_0": -0.8927010893821716, "umap_1": 11.425787925720215, "region": "Northern Africa", "recipe": "Moroccan Chicken (Tagine) - Atk", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49913"}, {"umap_0": 10.098462104797363, "umap_1": 7.098648548126221, "region": "Middle Eastern", "recipe": "Mandarin Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/60118"}, {"umap_0": 10.095542907714844, "umap_1": 9.283376693725586, "region": "Belgian", "recipe": "Dutch Apple Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134418"}, {"umap_0": 10.572176933288574, "umap_1": 8.129974365234375, "region": "Canadian", "recipe": "Cranberry Apple Tart", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143296"}, {"umap_0": -0.6531105041503906, "umap_1": 6.455648899078369, "region": "Italian", "recipe": "Torta Di Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122076"}, {"umap_0": 1.5893229246139526, "umap_1": 8.596124649047852, "region": "Middle Eastern", "recipe": "Persian Cutlet", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/60101"}, {"umap_0": -2.709214687347412, "umap_1": 7.584137916564941, "region": "Greek", "recipe": "Yummy Greek Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/107741"}, {"umap_0": -0.46193552017211914, "umap_1": 8.980504989624023, "region": "Mexican", "recipe": "Mexican Vegetable Caldo Verde Con Pollo", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/89042"}, {"umap_0": 3.207378625869751, "umap_1": 9.008267402648926, "region": "Chinese and Mongolian", "recipe": "Fat Free egg rolls", 
"url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/57168"}, {"umap_0": 10.267742156982422, "umap_1": 7.76490592956543, "region": "French", "recipe": "French Toast With Rum Bananas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114430"}, {"umap_0": -1.9979051351547241, "umap_1": 9.070948600769043, "region": "French", "recipe": "Creamy Saffron Mussels", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115667"}, {"umap_0": -2.2850003242492676, "umap_1": 8.51480770111084, "region": "Middle Eastern", "recipe": "Kisir", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69946"}, {"umap_0": -0.28104421496391296, "umap_1": 11.630274772644043, "region": "Indian Subcontinent", "recipe": "Coriander, Mint and Green Chilli Butter", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67713"}, {"umap_0": -0.7649892568588257, "umap_1": 12.151599884033203, "region": "Mexican", "recipe": "Jalapeno Spread-Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78884"}, {"umap_0": 0.7143282294273376, "umap_1": 8.882909774780273, "region": "Middle Eastern", "recipe": "Greek Stuffed Zucchini (Halal)(Kolokithakia Yemista)(Gluten Free", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70129"}, {"umap_0": 0.7959996461868286, "umap_1": 5.863668441772461, "region": "Irish", "recipe": "Sheboygan Beer and Brat Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136010"}, {"umap_0": -1.67689847946167, "umap_1": 6.733006477355957, "region": "Greek", "recipe": "Mediterranean Grilled Chicken Fingers #RSC", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/109749"}, {"umap_0": -0.688520073890686, "umap_1": 10.743539810180664, "region": "Greek", "recipe": "The Bean Soup Full of Choices", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/108434"}, {"umap_0": -0.9593788981437683, "umap_1": 10.908846855163574, "region": "Caribbean", "recipe": "Chickpea 
Fritters With Hot Pepper Mayonnaise", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90192"}, {"umap_0": 8.29439926147461, "umap_1": 6.031138896942139, "region": "Scandinavian", "recipe": "Simple Swedish Pastry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105883"}, {"umap_0": 1.8915190696716309, "umap_1": 5.487430572509766, "region": "Mexican", "recipe": "Enchiladas Mona Luisa", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84940"}, {"umap_0": 0.531986653804779, "umap_1": 12.960428237915039, "region": "Indian Subcontinent", "recipe": "Ontario Lamb Tikka Roast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64061"}, {"umap_0": -2.460339307785034, "umap_1": 8.003098487854004, "region": "Italian", "recipe": "Italian Salad With Balsamic Vinaigrette", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120487"}, {"umap_0": 10.026773452758789, "umap_1": 9.46152400970459, "region": "Deutschland", "recipe": "Summer Fruit Borscht", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138115"}, {"umap_0": -0.9145243167877197, "umap_1": 6.565636157989502, "region": "Italian", "recipe": "Rose's Parmesan-Breaded Fried Zucchini", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123618"}, {"umap_0": -0.12286669760942459, "umap_1": 11.758695602416992, "region": "Rest Africa", "recipe": "Doro Wat", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/51599"}, {"umap_0": -1.552089810371399, "umap_1": 8.917481422424316, "region": "Middle Eastern", "recipe": "Shawarma Djaj -- Chicken Shawarma (Lebanon -- Middle East)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63040"}, {"umap_0": 0.021580174565315247, "umap_1": 5.3114848136901855, "region": "French", "recipe": "Gratin Potatoes Dauphinois", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115316"}, {"umap_0": 11.945073127746582, "umap_1": 8.34962272644043, "region": "Mexican", 
"recipe": "S\u2019mores Quesadilla", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/85063"}, {"umap_0": 2.177293539047241, "umap_1": 6.099184989929199, "region": "Spanish and Portuguese", "recipe": "Mountain Rice (Arroz Da Serra) - Portuguese Brazil", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118679"}, {"umap_0": 10.135943412780762, "umap_1": 7.112623691558838, "region": "Eastern European", "recipe": "Hungarian Cherry Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105527"}, {"umap_0": 1.49203360080719, "umap_1": 9.381569862365723, "region": "Deutschland", "recipe": "Slow-Cooked German Short Ribs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137459"}, {"umap_0": 10.624743461608887, "umap_1": 9.374702453613281, "region": "Australian", "recipe": "Berry Watermelon Smoothie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76166"}, {"umap_0": -1.5848662853240967, "umap_1": 10.166808128356934, "region": "Mexican", "recipe": "Nopales Con Pico De Gallo", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83640"}, {"umap_0": -0.5579593181610107, "umap_1": 9.315838813781738, "region": "Spanish and Portuguese", "recipe": "El Bullitt ( Portuguese Vegetable Hot-Pot)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/119067"}, {"umap_0": -0.2012803554534912, "umap_1": 11.142373085021973, "region": "Middle Eastern", "recipe": "Baked Couscous with Tomatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61619"}, {"umap_0": 10.187808990478516, "umap_1": 6.482682704925537, "region": "UK", "recipe": "Gingerbread With Pecans and Raisins", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104309"}, {"umap_0": 11.59180736541748, "umap_1": 7.891847610473633, "region": "Eastern European", "recipe": "Russian Cheese Piroshki", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100604"}, {"umap_0": -0.9312697052955627, 
"umap_1": 3.9964046478271484, "region": "Italian", "recipe": "My Favorite Easy Lasagna Recipe", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129192"}, {"umap_0": 1.3099995851516724, "umap_1": 5.397132396697998, "region": "Mexican", "recipe": "La La La Bamba! Mexican Dip Ole'!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82140"}, {"umap_0": -1.265501856803894, "umap_1": 8.323101997375488, "region": "Italian", "recipe": "Costolette Di Maiale Con Salvia (Pork Chops With Sage)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/121782"}, {"umap_0": -0.1634191870689392, "umap_1": 9.486566543579102, "region": "Chinese and Mongolian", "recipe": "Dominican Moro De Habichuelas Negras  ( Rice and Beans)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53424"}, {"umap_0": -0.3582545816898346, "umap_1": 5.137890338897705, "region": "Italian", "recipe": "Cream of Mushroom Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129270"}, {"umap_0": 0.18689262866973877, "umap_1": 7.478733539581299, "region": "French", "recipe": "French Chicken [with Linguine]", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113907"}, {"umap_0": 2.126091241836548, "umap_1": 10.938815116882324, "region": "Thai", "recipe": "Thai-Style Grapefruit and Prawn", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3356"}, {"umap_0": -0.8398703336715698, "umap_1": 7.191639423370361, "region": "Australian", "recipe": "Toadstool Salad (For Kids!)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74859"}, {"umap_0": -1.4928706884384155, "umap_1": 8.687613487243652, "region": "Italian", "recipe": "Vegan Eggplant Rollatini", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123182"}, {"umap_0": 1.1508342027664185, "umap_1": 6.171228885650635, "region": "Mexican", "recipe": "Pico de gallo shrimp salad", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81349"}, {"umap_0": 0.9417232871055603, "umap_1": 12.73490047454834, "region": "Indian Subcontinent", "recipe": "Chicken in Yogurt and Almond Curry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/65418"}, {"umap_0": 0.8187896609306335, "umap_1": 5.6959028244018555, "region": "Italian", "recipe": "Best Italian Alfredo Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123689"}, {"umap_0": 9.684968948364258, "umap_1": 8.2518892288208, "region": "Mexican", "recipe": "Pan De Meurete (Bread of the Dead) or Sugar Skull Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87849"}, {"umap_0": -1.1702262163162231, "umap_1": 11.809029579162598, "region": "Mexican", "recipe": "Pico De Gallo Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83470"}, {"umap_0": 10.294347763061523, "umap_1": 9.280075073242188, "region": "Canadian", "recipe": "Plum and Apple Fool", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144624"}, {"umap_0": 10.91801643371582, "umap_1": 10.622156143188477, "region": "Mexican", "recipe": "Ginger Margaritas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84748"}, {"umap_0": -1.1414788961410522, "umap_1": 6.660133361816406, "region": "Italian", "recipe": "Adriana's Pasta With Sausage, Cannellini Bean, and Broccoli Rabe", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/126469"}, {"umap_0": 1.0549054145812988, "umap_1": 8.457744598388672, "region": "Mexican", "recipe": "Red Bean Chili", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82930"}, {"umap_0": 1.85953688621521, "umap_1": 10.063806533813477, "region": "Chinese and Mongolian", "recipe": "Chinese Flank Steak", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53244"}, {"umap_0": 0.11703982949256897, "umap_1": 4.702688217163086, "region": "Canadian", "recipe": "Mom's Cheesy Potatoe Caserole 
#5FIX", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143349"}, {"umap_0": 2.488203525543213, "umap_1": 10.506277084350586, "region": "Thai", "recipe": "Thai Braised Pork Ribs (Pressure Cooker)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/57705"}, {"umap_0": 1.4033921957015991, "umap_1": 12.588260650634766, "region": "Indian Subcontinent", "recipe": "Akki Rotti", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4639"}, {"umap_0": -1.7084037065505981, "umap_1": 10.002152442932129, "region": "Scandinavian", "recipe": "Mustard Dill Sauce - to Serve With Salmon", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/139638"}, {"umap_0": 0.48281338810920715, "umap_1": 5.531712055206299, "region": "Deutschland", "recipe": "Krautburger", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/13377"}, {"umap_0": 2.268494129180908, "umap_1": 10.031625747680664, "region": "Rest Africa", "recipe": "Cornflake-crumbed fish with spinach", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49303"}, {"umap_0": 1.3866162300109863, "umap_1": 9.900384902954102, "region": "Chinese and Mongolian", "recipe": "Waldorf Brown Rice Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53991"}, {"umap_0": 10.918259620666504, "umap_1": 7.958308219909668, "region": "Deutschland", "recipe": "Boysenberry-Filled Sandwich Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118593"}, {"umap_0": 11.057143211364746, "umap_1": 9.138713836669922, "region": "French", "recipe": "French Chocolate Bark", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112089"}, {"umap_0": 0.7429876327514648, "umap_1": 12.343132972717285, "region": "Indian Subcontinent", "recipe": "Chicken Korma - Takeaway Style!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64071"}, {"umap_0": 1.3051496744155884, "umap_1": 7.397836685180664, "region": "South American", "recipe": 
"Beef Stroganoff Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96858"}, {"umap_0": -1.5075063705444336, "umap_1": 12.006537437438965, "region": "Mexican", "recipe": "Creamy Mexican Green Salsa-Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/89791"}, {"umap_0": 3.148944854736328, "umap_1": 8.9029541015625, "region": "Chinese and Mongolian", "recipe": "Sue Gow", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/52557"}, {"umap_0": -0.12715567648410797, "umap_1": 10.932998657226562, "region": "Indian Subcontinent", "recipe": "Garlic Mushrooms With Spinach Raita", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/68770"}, {"umap_0": 3.5829391479492188, "umap_1": 9.868836402893066, "region": "Australian", "recipe": "Sesame Pasta Chicken Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/5296"}, {"umap_0": -1.289717674255371, "umap_1": 5.662283897399902, "region": "US", "recipe": "Creamy Jambalaya Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/14973"}, {"umap_0": 1.046501636505127, "umap_1": 12.69597053527832, "region": "Indian Subcontinent", "recipe": "Smoked Seekh Kababs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64135"}, {"umap_0": 9.486129760742188, "umap_1": 7.297597885131836, "region": "Indian Subcontinent", "recipe": "Apple Jalebi", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66925"}, {"umap_0": -0.9925557374954224, "umap_1": 9.947272300720215, "region": "Canadian", "recipe": "Cabbage-Zucchini Stir Fry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146163"}, {"umap_0": 0.3792119026184082, "umap_1": 4.707038879394531, "region": "Deutschland", "recipe": "Easy German Lasagna", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138392"}]}};
var embedOpt = {"mode": "vega-lite"};

function showError(el, error){
el.innerHTML = ('&lt;div class="error" style="color:red;"&gt;'
  + '&lt;p&gt;JavaScript Error: ' + error.message + '&lt;/p&gt;'
  + "&lt;p&gt;This usually means there's a typo in your chart specification. "
  + "See the javascript console for the full traceback.&lt;/p&gt;"
  + '&lt;/div&gt;');
throw error;
}
const el = document.getElementById('vis-1');
vegaEmbed("#vis-1", spec, embedOpt)
.catch(error =&gt; showError(el, error));
})(vegaEmbed);

&lt;/script&gt;

&lt;p&gt;Remember, the closer two recipes are, the more similar they &lt;em&gt;should&lt;/em&gt; be. It looks like it. Notice how recipes from the same region tend to form groups. This makes sense: recipes of the same region use similar sets of ingredients.&lt;/p&gt;
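&lt;p&gt;To make "closer" concrete, here is a minimal sketch that measures the straight-line distance between three recipes, using coordinates copied from the chart data above. The &lt;code&gt;distance&lt;/code&gt; helper is my own illustration, not part of the original notebook, and distances in a UMAP projection are only a rough proxy for similarity in the full embedding space.&lt;/p&gt;

```python
import math

# (umap_0, umap_1) coordinates copied from the chart data above
points = {
    "Pico De Gallo Dip": (-1.1702262163162231, 11.809029579162598),
    "Creamy Mexican Green Salsa-Dip": (-1.5075063705444336, 12.006537437438965),
    "Ginger Margaritas": (10.91801643371582, 10.622156143188477),
}


def distance(a, b):
    """Euclidean distance between two recipes in the 2D projection."""
    return math.dist(points[a], points[b])


dip_to_salsa = distance("Pico De Gallo Dip", "Creamy Mexican Green Salsa-Dip")
dip_to_margarita = distance("Pico De Gallo Dip", "Ginger Margaritas")

# True when the dip sits closer to the salsa than to the cocktail
print(dip_to_margarita > dip_to_salsa)
```

&lt;p&gt;The dip and the salsa land much closer to each other than the dip and the margaritas, which is what we would hope for from two Mexican dips.&lt;/p&gt;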
&lt;h2 id="i-dont-like-spinach"&gt;I don't like spinach&lt;/h2&gt;
&lt;p&gt;While studying recommendation systems, I &lt;a href="http://charuaggarwal.net/Recommender-Systems.htm"&gt;came across&lt;/a&gt; a &lt;em&gt;particularly&lt;/em&gt; interesting technique for building recommender systems.&lt;/p&gt;
&lt;p&gt;The first step is to create a network of items (recipes in our case). This includes the recipes that you like, and recipes that you don't like. Using pandas, let's define that:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# let&amp;#39;s define some polarising ingredients&lt;/span&gt;

&lt;span class="n"&gt;liked_ingredients&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;cheese&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;pizza&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;avocado&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;olive oil&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;hated_ingredients&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;spinach&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;mushroom&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;olive&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;beef&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;truffle&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;blue cheese&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# let&amp;#39;s add a &amp;quot;like&amp;quot; property&lt;/span&gt;

&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;like&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

&lt;span class="c1"&gt;# and populate our data (the liked rule runs last, so it overrides the hated one)&lt;/span&gt;

&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ingredient_string&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;|&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hated_ingredients&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;like&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ingredient_string&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;|&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;liked_ingredients&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;like&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Yeah, I don't like spinach. Let me be.&lt;/p&gt;
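&lt;p&gt;One detail worth spelling out: the patterns are plain substring matches, and the liked rule is applied after the hated one. The sketch below mirrors that logic in plain Python (no pandas) so the precedence is easy to see. The &lt;code&gt;label&lt;/code&gt; function and the example ingredient strings are hypothetical, made up for illustration.&lt;/p&gt;

```python
import re

liked_ingredients = ["cheese", "pizza", "avocado", "olive oil"]
hated_ingredients = ["spinach", "mushroom", "olive", "beef", "truffle", "blue cheese"]


def label(ingredient_string):
    """Return -1, 0 or 1 using the same precedence as the pandas code."""
    text = ingredient_string.lower()
    like = 0
    if re.search("|".join(hated_ingredients), text):
        like = -1
    # Checked last, so a liked ingredient overrides a hated one
    if re.search("|".join(liked_ingredients), text):
        like = 1
    return like


# Hypothetical ingredient strings, for illustration only
examples = {
    "Spinach pie": "Spinach, Flour, Egg",
    "Guacamole": "Avocado, Lime, Salt",
    "Blue cheese burger": "Beef, Blue Cheese, Bun",
    "Plain rice": "Rice, Water",
}
labels = {name: label(ingredients) for name, ingredients in examples.items()}
print(labels)
```

&lt;p&gt;The burger contains two hated ingredients, but "cheese" also matches "blue cheese", so it ends up liked. If you want hated ingredients to win, swap the order of the two assignments.&lt;/p&gt;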
&lt;p&gt;Once I've defined which recipes I like and which I don't, we can leverage our embeddings again to visualise &lt;em&gt;my own&lt;/em&gt; network of recipes. Here's a sample of 500:&lt;/p&gt;
&lt;div id="vis"&gt;&lt;/div&gt;
&lt;script&gt;
(function(vegaEmbed) {
var spec = {"config": {"view": {"continuousWidth": 400, "continuousHeight": 300}}, "data": {"name": "data-a48bde4a404c70bcc159550b325d9243"}, "mark": "point", "encoding": {"color": {"field": "like", "scale": {"domain": [0.0, 1.0, -1.0], "range": ["gray", "green", "red"]}, "type": "nominal"}, "tooltip": [{"field": "recipe", "type": "nominal"}, {"field": "region", "type": "nominal"}, {"field": "url", "type": "nominal"}], "x": {"field": "umap_0", "type": "quantitative"}, "y": {"field": "umap_1", "type": "quantitative"}}, "height": 500, "selection": {"selector002": {"type": "interval", "bind": "scales", "encodings": ["x", "y"]}}, "title": "Visualizing 500 recipes", "width": 500, "$schema": "https://vega.github.io/schema/vega-lite/v4.17.0.json", "datasets": {"data-a48bde4a404c70bcc159550b325d9243": [{"umap_0": -1.1850786209106445, "umap_1": 6.134487152099609, "region": "Italian", "recipe": "Crock Pot Spaghetti", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129855", "like": -1}, {"umap_0": -1.1895201206207275, "umap_1": 8.653212547302246, "region": "Eastern European", "recipe": "Pate Iz Shchuki -Baked Salmon Mousse", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100310", "like": 0}, {"umap_0": -0.5190680027008057, "umap_1": 6.319146156311035, "region": "Canadian", "recipe": "Spanish stuffed meatloaf", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148887", "like": 1}, {"umap_0": 9.447676658630371, "umap_1": 6.079851150512695, "region": "Canadian", "recipe": "Kefir Jam Muffins", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145871", "like": 0}, {"umap_0": -2.2245516777038574, "umap_1": 8.895587921142578, "region": "Australian", "recipe": "Red Wine Steak and Mushrooms", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75932", "like": 1}, {"umap_0": 1.0995079278945923, "umap_1": 5.980602741241455, "region": "South American", "recipe": "Hot Mexican Beef Salad", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96342", "like": 1}, {"umap_0": -0.22713147103786469, "umap_1": 8.200007438659668, "region": "Canadian", "recipe": "Apple, Carrot Pineapple Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148164", "like": 0}, {"umap_0": 3.311671257019043, "umap_1": 9.73155403137207, "region": "Korean", "recipe": "Japchae", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4723", "like": -1}, {"umap_0": 10.302436828613281, "umap_1": 10.304679870605469, "region": "US", "recipe": "Fresh Raspberry Mojito", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17745", "like": 0}, {"umap_0": 3.8760950565338135, "umap_1": 10.024311065673828, "region": "Southeast Asian", "recipe": "Sesame Eggplant (Aubergine) Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59850", "like": 0}, {"umap_0": 0.3885427713394165, "umap_1": 8.351722717285156, "region": "Australian", "recipe": "Lemon Scented Lamb", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75969", "like": 0}, {"umap_0": 1.5670115947723389, "umap_1": 10.798117637634277, "region": "Australian", "recipe": "Spicy Apricot Lamb Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76663", "like": 0}, {"umap_0": 9.558920860290527, "umap_1": 6.659140110015869, "region": "Indian Subcontinent", "recipe": "Kashmiri Naan", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/65649", "like": 0}, {"umap_0": 1.9093105792999268, "umap_1": 5.1884894371032715, "region": "Mexican", "recipe": "Lailani's Taco Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78878", "like": 1}, {"umap_0": 9.506790161132812, "umap_1": 6.551031112670898, "region": "Canadian", "recipe": "Healthy Oatmeal Yogurt Bran Muffins", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147918", "like": 0}, {"umap_0": 8.961240768432617, "umap_1": 7.516850471496582, "region": "Middle 
Eastern", "recipe": "Megli Popsicles (Lebanese)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63118", "like": 0}, {"umap_0": 2.4023356437683105, "umap_1": 10.893607139587402, "region": "Thai", "recipe": "Khao Soi Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3176", "like": 0}, {"umap_0": -1.3299518823623657, "umap_1": 9.10419750213623, "region": "Middle Eastern", "recipe": "Bazella (Middle Eastern-Style Beef Casserole)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61623", "like": 1}, {"umap_0": -1.29935884475708, "umap_1": 7.717833042144775, "region": "Italian", "recipe": "Fast 'n Easy FLIPPIN GOOD CHICKEN!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122078", "like": 1}, {"umap_0": 1.114492416381836, "umap_1": 8.263710975646973, "region": "Rest Africa", "recipe": "Egyptian Yellow Lentil Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47717", "like": -1}, {"umap_0": 8.602962493896484, "umap_1": 8.177571296691895, "region": "Deutschland", "recipe": "Blancmanger (Chicken and Rice Casserole)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138725", "like": 0}, {"umap_0": -1.606046199798584, "umap_1": 6.272897720336914, "region": "Italian", "recipe": "Sunday Sauce With Meatballs, Sausages, and Pork Chops", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123843", "like": 1}, {"umap_0": 10.657774925231934, "umap_1": 8.120484352111816, "region": "Canadian", "recipe": "Canadian Bacon Stack-Ups", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147941", "like": 0}, {"umap_0": -0.6598324775695801, "umap_1": 8.080796241760254, "region": "French", "recipe": "Cassoulet", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112343", "like": 0}, {"umap_0": 10.475963592529297, "umap_1": 5.699642658233643, "region": "French", "recipe": "Breadless French Toast", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115247", "like": 0}, {"umap_0": 0.818125307559967, "umap_1": 9.986652374267578, "region": "Northern Africa", "recipe": "Moroccan Flavored Pork Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/50289", "like": 1}, {"umap_0": 1.3073502779006958, "umap_1": 6.508403301239014, "region": "Indian Subcontinent", "recipe": "Easy Microwave Potato and Corn Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64400", "like": 1}, {"umap_0": 0.4339883327484131, "umap_1": 7.10546875, "region": "Australian", "recipe": "Breadfruit and Lamb Liver", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76512", "like": 1}, {"umap_0": 10.413260459899902, "umap_1": 7.858121871948242, "region": "Middle Eastern", "recipe": "Omm 'Ali (Egyptian Bread and Butter Pudding)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47477", "like": 0}, {"umap_0": -0.6674445867538452, "umap_1": 5.123907566070557, "region": "US", "recipe": "Broccoli and Artichoke Bake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/14820", "like": 1}, {"umap_0": -0.21166661381721497, "umap_1": 11.940535545349121, "region": "Indian Subcontinent", "recipe": "Indian Tomato Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4086", "like": 1}, {"umap_0": 2.3027584552764893, "umap_1": 10.956110954284668, "region": "Thai", "recipe": "Coconut, Lime and Chilli Baked Fish With Thai Potato Wedges", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59450", "like": 1}, {"umap_0": 10.223357200622559, "umap_1": 8.356406211853027, "region": "Korean", "recipe": "Lemon Sweet &amp; Sour Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69229", "like": 0}, {"umap_0": -1.7379339933395386, "umap_1": 5.8034749031066895, "region": "Greek", "recipe": "Joe's Tasty Bruschetta Pizza!", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110324", "like": 1}, {"umap_0": 0.2126130759716034, "umap_1": 7.719091892242432, "region": "French", "recipe": "Les Supremes De Volaile Saint Sylvestre", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112943", "like": 0}, {"umap_0": 2.5439085960388184, "umap_1": 10.184285163879395, "region": "Chinese and Mongolian", "recipe": "Marinated Chinese Chicken Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53252", "like": 0}, {"umap_0": 1.618972659111023, "umap_1": 7.0421013832092285, "region": "US", "recipe": "Cranberry Brisket", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/18716", "like": -1}, {"umap_0": 0.4893477261066437, "umap_1": 10.455350875854492, "region": "Mexican", "recipe": "Chipotle Tilapia Tacos with Mango-Cilantro Salsa", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6134", "like": 0}, {"umap_0": 2.007398843765259, "umap_1": 4.715290546417236, "region": "Mexican", "recipe": "Mexican-Style Beefy Tortellini", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82398", "like": 1}, {"umap_0": -0.35907480120658875, "umap_1": 10.42431640625, "region": "Mexican", "recipe": "Mexican Cabbage Salad-Taco Filling", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77801", "like": 0}, {"umap_0": 1.94753897190094, "umap_1": 11.03625202178955, "region": "Thai", "recipe": "Chicken Salad With Thai-Flavored Dressing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3384", "like": 0}, {"umap_0": 1.6027430295944214, "umap_1": 6.762001991271973, "region": "US", "recipe": "Aunt Fanny's Squash", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16132", "like": 0}, {"umap_0": -0.5982398390769958, "umap_1": 8.583219528198242, "region": "Canadian", "recipe": "Canadian Mary Pickford's University Avenue Potage", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143813", "like": 0}, 
{"umap_0": 2.6169447898864746, "umap_1": 10.519927978515625, "region": "Japanese", "recipe": "Dr. Andrew Weil\u2019s Grilled or Broiled Salmon", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71842", "like": 0}, {"umap_0": 11.396207809448242, "umap_1": 8.511942863464355, "region": "French", "recipe": "Almond Cream With Fruit", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113134", "like": 0}, {"umap_0": 10.781975746154785, "umap_1": 8.984734535217285, "region": "UK", "recipe": "Strawberry-Lemon Angel Food Trifle", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/103951", "like": 0}, {"umap_0": 0.7907938957214355, "umap_1": 10.293313026428223, "region": "Middle Eastern", "recipe": "Egyptian Koshary Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47308", "like": 0}, {"umap_0": -1.2615807056427002, "umap_1": 8.718500137329102, "region": "Caribbean", "recipe": "Rabo Encendido(Cuban Oxtail Stew)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92674", "like": 1}, {"umap_0": 1.245664358139038, "umap_1": 4.96910285949707, "region": "Indian Subcontinent", "recipe": "Fuss-Free Biryani Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64086", "like": 0}, {"umap_0": -0.3872736990451813, "umap_1": 11.4719877243042, "region": "Middle Eastern", "recipe": "Spicy Dilly Beans", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72524", "like": 0}, {"umap_0": -0.10169387608766556, "umap_1": 8.22164249420166, "region": "Canadian", "recipe": "Maple Ham Peachies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144208", "like": 0}, {"umap_0": 10.378604888916016, "umap_1": 6.253831386566162, "region": "UK", "recipe": "Mixed Nut Shortbread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101649", "like": 0}, {"umap_0": -1.3679935932159424, "umap_1": 6.473010540008545, "region": "Italian", "recipe": "Spaghetti (With Mushroom 
Soup?!)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125094", "like": -1}, {"umap_0": 0.8709162473678589, "umap_1": 12.464061737060547, "region": "Indian Subcontinent", "recipe": "Pathani Chicken Biryani", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72390", "like": 0}, {"umap_0": 2.7814364433288574, "umap_1": 9.63673210144043, "region": "Chinese and Mongolian", "recipe": "Szechuan Vegetable Stir-Fry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55902", "like": 0}, {"umap_0": -0.6416953206062317, "umap_1": 6.175542831420898, "region": "Canadian", "recipe": "Deluxe Mac and Cheese", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142968", "like": 1}, {"umap_0": 0.022139783948659897, "umap_1": 10.70101261138916, "region": "South American", "recipe": "Tequila-Marinated Beef Fajitas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/97772", "like": 0}, {"umap_0": 10.462568283081055, "umap_1": 9.70547103881836, "region": "Mexican", "recipe": "Monkey Shake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81498", "like": 0}, {"umap_0": 0.9298517107963562, "umap_1": 12.21575927734375, "region": "Indian Subcontinent", "recipe": "Coconut Beet Samosas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67831", "like": 1}, {"umap_0": -0.6186256408691406, "umap_1": 7.069271087646484, "region": "Italian", "recipe": "Pollo Capri - Adapted from Bella's Italian Cafe, Tampa, Florida", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123308", "like": 1}, {"umap_0": 0.7549301385879517, "umap_1": 7.419032096862793, "region": "Italian", "recipe": "Light Veal Marsala", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127832", "like": -1}, {"umap_0": 10.337684631347656, "umap_1": 7.999570846557617, "region": "Rest Africa", "recipe": "Sauteed Bananas With Caramel Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48955", 
"like": 0}, {"umap_0": 0.7815414071083069, "umap_1": 9.64523696899414, "region": "Mexican", "recipe": "Mexican Black Bean Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81016", "like": 0}, {"umap_0": -0.6640657186508179, "umap_1": 5.883609294891357, "region": "Indian Subcontinent", "recipe": "Low Carb Meatballs a La Parmigiana", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66347", "like": 1}, {"umap_0": 1.2964386940002441, "umap_1": 5.1462273597717285, "region": "UK", "recipe": "Stilton and Grape Avocados Recipe", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/104465", "like": 1}, {"umap_0": -0.7768697738647461, "umap_1": 6.05380916595459, "region": "Italian", "recipe": "Pesto Filled Spiral Meatloaf", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123793", "like": 1}, {"umap_0": 3.0572879314422607, "umap_1": 7.49185848236084, "region": "French", "recipe": "French Fried Onions (Gluten-Free)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/111785", "like": 0}, {"umap_0": -0.3980916142463684, "umap_1": 12.852380752563477, "region": "Caribbean", "recipe": "Jammin' Jamaican Spice Blend", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92043", "like": 0}, {"umap_0": 0.5679068565368652, "umap_1": 7.683577060699463, "region": "US", "recipe": "Creamy Sweet Potato Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/15184", "like": 0}, {"umap_0": 10.96013069152832, "umap_1": 7.587233543395996, "region": "UK", "recipe": "English Walnut Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/103763", "like": 0}, {"umap_0": -0.47361117601394653, "umap_1": 8.561030387878418, "region": "UK", "recipe": "Easy Chicken Stock", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102053", "like": 0}, {"umap_0": 8.828392028808594, "umap_1": 8.148130416870117, "region": "Middle Eastern", "recipe": "Iranian Saffron Rice Pudding (Sholeh 
Zard) (Dairy Free &amp; Glu", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/60933", "like": 0}, {"umap_0": 0.44980233907699585, "umap_1": 9.950838088989258, "region": "Mexican", "recipe": "Pork and Apple Burritos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84247", "like": 1}, {"umap_0": -0.2053876519203186, "umap_1": 6.008750915527344, "region": "South American", "recipe": "Tessa Kiros Beef and Potato Croquettes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/95917", "like": 1}, {"umap_0": 2.67214298248291, "umap_1": 10.572834968566895, "region": "Thai", "recipe": "Thai Shrimp and Broccoli Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58511", "like": 0}, {"umap_0": -0.1595761477947235, "umap_1": 5.343988418579102, "region": "Mexican", "recipe": "Baked Chorizo Cheesy Mac", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84522", "like": 1}, {"umap_0": -1.598894476890564, "umap_1": 11.995153427124023, "region": "Mexican", "recipe": "Chipotle Mexican Grill Guacamole Copycat", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87336", "like": 1}, {"umap_0": 0.052441176027059555, "umap_1": 5.2800493240356445, "region": "French", "recipe": "Creamy French Onion Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113902", "like": 1}, {"umap_0": 1.6097749471664429, "umap_1": 7.684686183929443, "region": "Eastern European", "recipe": "Beef Stroganoff with Noodles", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/8597", "like": -1}, {"umap_0": 0.5770699977874756, "umap_1": 6.024519920349121, "region": "French", "recipe": "French Onion Biscuits", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114084", "like": 0}, {"umap_0": 9.463890075683594, "umap_1": 9.363100051879883, "region": "Italian", "recipe": "Granita De Limone (Sicilian Lemon Ice)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120780", 
"like": 0}, {"umap_0": -0.5556496977806091, "umap_1": 9.402655601501465, "region": "Chinese and Mongolian", "recipe": "White Beans &amp; Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/54234", "like": 0}, {"umap_0": -1.6857001781463623, "umap_1": 7.512906074523926, "region": "Deutschland", "recipe": "German Salat Fix", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137251", "like": 1}, {"umap_0": 1.2722601890563965, "umap_1": 10.944067001342773, "region": "Australian", "recipe": "Cranberry Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73556", "like": 0}, {"umap_0": 0.4704318940639496, "umap_1": 4.850879669189453, "region": "South American", "recipe": "Beef and Blue Sandwich", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96044", "like": 1}, {"umap_0": 11.64326000213623, "umap_1": 7.896361351013184, "region": "Mexican", "recipe": "Mexican Coffee Balls", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82205", "like": 0}, {"umap_0": -0.5871949791908264, "umap_1": 4.973294258117676, "region": "Italian", "recipe": "Crispy Eggplant with Spicy Tomato Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12376", "like": 1}, {"umap_0": 10.561944007873535, "umap_1": 8.81438159942627, "region": "Italian", "recipe": "Limoncello Panna Cotta With Black Cherry Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/121274", "like": 0}, {"umap_0": 10.184091567993164, "umap_1": 7.238198757171631, "region": "Italian", "recipe": "Michele's Spiced Almond Biscotti", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122057", "like": 0}, {"umap_0": 2.5322582721710205, "umap_1": 10.279123306274414, "region": "Thai", "recipe": "Mango Summer Rolls &amp; Thai Dipping Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58648", "like": 0}, {"umap_0": 0.7151965498924255, "umap_1": 9.91037654876709, "region": "Mexican", 
"recipe": "Pueblo Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82618", "like": 0}, {"umap_0": 3.6300299167633057, "umap_1": 9.259065628051758, "region": "Japanese", "recipe": "Japanese Vegetable Pancakes (Okonomiyaki)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71890", "like": -1}, {"umap_0": 1.4382530450820923, "umap_1": 7.865445613861084, "region": "Chinese and Mongolian", "recipe": "Oven-Cooked Rice Pilaf", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/54403", "like": 0}, {"umap_0": -0.5371537208557129, "umap_1": 4.815740585327148, "region": "South American", "recipe": "Cheesy Beef Noodle Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99597", "like": 1}, {"umap_0": 10.231728553771973, "umap_1": 7.1652445793151855, "region": "Rest Africa", "recipe": "Khobz Tunis - Algerian Helouwa - My Family Recipe", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47972", "like": 0}, {"umap_0": 0.24582378566265106, "umap_1": 10.788599967956543, "region": "Mexican", "recipe": "Tacos al Pastor", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/7006", "like": 0}, {"umap_0": 1.425007700920105, "umap_1": 8.345734596252441, "region": "South American", "recipe": "Believe It or Not Roast Beef", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98284", "like": -1}, {"umap_0": 0.8389599323272705, "umap_1": 8.373648643493652, "region": "South American", "recipe": "Broedlaewend (Romanian Beef Soup)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96275", "like": 1}, {"umap_0": -2.4443905353546143, "umap_1": 9.003817558288574, "region": "Italian", "recipe": "Italian Orange Slices", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125157", "like": 1}, {"umap_0": 9.854039192199707, "umap_1": 7.419904708862305, "region": "Chinese and Mongolian", "recipe": "(Leftover) Rice Pudding", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53456", "like": 0}, {"umap_0": 2.2491025924682617, "umap_1": 8.778008460998535, "region": "Belgian", "recipe": "Belgian Mayonnaise to Serve With Frites", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/106602", "like": 0}, {"umap_0": 0.44109848141670227, "umap_1": 4.630356788635254, "region": "US", "recipe": "Frankly Super Supper", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/18001", "like": 1}, {"umap_0": -0.32183316349983215, "umap_1": 11.799098014831543, "region": "US", "recipe": "Chef John's Buttermilk Fried Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/14487", "like": 0}, {"umap_0": -1.290636658668518, "umap_1": 8.649124145507812, "region": "Rest Africa", "recipe": "Safari Steak for the Hunter", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48084", "like": 0}, {"umap_0": -0.6573124527931213, "umap_1": 10.916947364807129, "region": "Northern Africa", "recipe": "Moroccan Spiced Preserved Lemons", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49826", "like": 0}, {"umap_0": -0.929466187953949, "umap_1": 4.51997184753418, "region": "Italian", "recipe": "Italian Hoagie Delight", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132631", "like": 1}, {"umap_0": -0.9419817328453064, "umap_1": 8.130386352539062, "region": "Indian Subcontinent", "recipe": "Caramelized Cumin-Roasted Carrots", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63951", "like": 1}, {"umap_0": 0.8741731643676758, "umap_1": 11.66552734375, "region": "Middle Eastern", "recipe": "A Baqliyya of Zirlab's", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61589", "like": 0}, {"umap_0": -2.3510994911193848, "umap_1": 7.491191387176514, "region": "Italian", "recipe": "\"Good Carb\" Fettuccine Witih Fresh Tomatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/119308", "like": 1}, 
{"umap_0": 9.6798734664917, "umap_1": 6.823861598968506, "region": "Greek", "recipe": "Greek Yogurt Strawberry Banana Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110060", "like": 0}, {"umap_0": -0.6030805706977844, "umap_1": 7.98883581161499, "region": "French", "recipe": "Burgundy Pork Tenderloin", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113354", "like": 0}, {"umap_0": 11.5413236618042, "umap_1": 6.94635534286499, "region": "Mexican", "recipe": "Ninfa's Flan", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80019", "like": 1}, {"umap_0": 10.419769287109375, "umap_1": 9.615630149841309, "region": "South American", "recipe": "My Mom's Cranberry Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100008", "like": 0}, {"umap_0": 0.20882202684879303, "umap_1": 8.077316284179688, "region": "South American", "recipe": "Heinz 57 Beef Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98809", "like": -1}, {"umap_0": -1.8444103002548218, "umap_1": 7.516116619110107, "region": "Italian", "recipe": "Melon, Prosciutto &amp; Mozzarella Salad With Mint Vinaigrette", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127989", "like": 1}, {"umap_0": 0.8042580485343933, "umap_1": 6.945413589477539, "region": "Irish", "recipe": "Colcannon(Mashed Potato &amp; Cabbage)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134644", "like": 0}, {"umap_0": 10.007749557495117, "umap_1": 5.933751583099365, "region": "Australian", "recipe": "Spelt Soda Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76365", "like": 0}, {"umap_0": 0.564936637878418, "umap_1": 6.081768035888672, "region": "Canadian", "recipe": "Microwave Poached Eggs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147710", "like": 1}, {"umap_0": 11.158587455749512, "umap_1": 7.070868015289307, "region": "French", "recipe": "Peach Cobbler  French Toast", 
"url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/116263", "like": 0}, {"umap_0": -0.11201471090316772, "umap_1": 11.078572273254395, "region": "US", "recipe": "Cranberry Pork Chops I", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/18696", "like": 0}, {"umap_0": 1.7187167406082153, "umap_1": 7.4296555519104, "region": "Deutschland", "recipe": "Helen's Creamy Garlic String - Green Beans", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137904", "like": 0}, {"umap_0": -2.461521625518799, "umap_1": 8.884740829467773, "region": "Italian", "recipe": "Chef John's Bagna Cauda", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/11365", "like": 1}, {"umap_0": 2.537705183029175, "umap_1": 9.78603458404541, "region": "Chinese and Mongolian", "recipe": "Pei Wei Asian Diner Shanghai Pork Ribs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56756", "like": 0}, {"umap_0": 0.5408082604408264, "umap_1": 12.35210132598877, "region": "Indian Subcontinent", "recipe": "Banana Dahl", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/65926", "like": 0}, {"umap_0": 1.1257686614990234, "umap_1": 8.981525421142578, "region": "Middle Eastern", "recipe": "Makhlouta 2 - Vegetarian Hearty Lebanese Lentil &amp; Bean Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63063", "like": 0}, {"umap_0": -0.8881869316101074, "umap_1": 4.514981746673584, "region": "Italian", "recipe": "Bob's Awesome Lasagna", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/11872", "like": 1}, {"umap_0": 0.3362944424152374, "umap_1": 7.0383429527282715, "region": "Irish", "recipe": "Corned Beef and Cabbage for 2", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136261", "like": -1}, {"umap_0": 1.5462255477905273, "umap_1": 11.949945449829102, "region": "Indian Subcontinent", "recipe": "Sweet and Sour Fish Khatti Meethi Macchi", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64984", "like": 0}, {"umap_0": -0.9165903925895691, "umap_1": 8.658659934997559, "region": "Scandinavian", "recipe": "Karrysalat - Danish Curried Macaroni and Herring Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101212", "like": 1}, {"umap_0": 2.67948317527771, "umap_1": 7.135268211364746, "region": "Italian", "recipe": "Potato, Parmesan and Anchovy Focaccia", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132249", "like": 1}, {"umap_0": -0.3302302956581116, "umap_1": 7.245589733123779, "region": "Mexican", "recipe": "Sopa De Calabacin Y Guajolote (Zucchini and Turkey Soup)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82599", "like": 0}, {"umap_0": 10.758835792541504, "umap_1": 6.664599418640137, "region": "Canadian", "recipe": "A Different Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147627", "like": 0}, {"umap_0": 10.548820495605469, "umap_1": 7.828482151031494, "region": "French", "recipe": "Apple Cinnamon French Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112177", "like": 0}, {"umap_0": 1.1214299201965332, "umap_1": 11.835747718811035, "region": "Indian Subcontinent", "recipe": "Restaurant Style Indian Butter Chicken (Chicken Makhani)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64597", "like": 1}, {"umap_0": -0.49632641673088074, "umap_1": 10.749210357666016, "region": "Middle Eastern", "recipe": "Chicken Shawarma", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63045", "like": 1}, {"umap_0": -0.6656883358955383, "umap_1": 5.413561820983887, "region": "Italian", "recipe": "Baked Mashed Potatoes Italian", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/119844", "like": 1}, {"umap_0": -0.8973937630653381, "umap_1": 8.444534301757812, "region": "Caribbean", "recipe": "Caribbean Potato and Eggs Salad", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90588", "like": 0}, {"umap_0": -2.2734081745147705, "umap_1": 8.599804878234863, "region": "Italian", "recipe": "Spicy Linguine With Clams", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125636", "like": 1}, {"umap_0": 9.773694038391113, "umap_1": 6.301175117492676, "region": "Canadian", "recipe": "Parkin Drops", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143090", "like": 0}, {"umap_0": 3.7195725440979004, "umap_1": 9.354264259338379, "region": "Japanese", "recipe": "Japanese Wok-Fried Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71206", "like": 0}, {"umap_0": -2.3354671001434326, "umap_1": 8.990545272827148, "region": "Italian", "recipe": "Trota Alle Erbette (Trout with Herbs)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/10512", "like": 1}, {"umap_0": 3.0256967544555664, "umap_1": 9.340682983398438, "region": "Chinese and Mongolian", "recipe": "Beef and Mushroom Stir-Fry Korean Style", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56776", "like": -1}, {"umap_0": 0.21639230847358704, "umap_1": 5.183401107788086, "region": "UK", "recipe": "Collier\u2019s Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102743", "like": 1}, {"umap_0": -1.54658842086792, "umap_1": 9.06863784790039, "region": "US", "recipe": "Cowboy Kale", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17345", "like": 1}, {"umap_0": 0.5320228934288025, "umap_1": 8.11629867553711, "region": "South American", "recipe": "Beef &amp; Portabellas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99143", "like": 1}, {"umap_0": 2.7491660118103027, "umap_1": 10.60092544555664, "region": "Southeast Asian", "recipe": "Spicy Peanut Dipping Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62230", "like": 0}, {"umap_0": -0.06501508504152298, "umap_1": 10.493794441223145, 
"region": "Deutschland", "recipe": "German Geraucherte Forelle Nach Art Des Hauses (Smoked Trout)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138182", "like": 0}, {"umap_0": 4.006468772888184, "umap_1": 10.27139949798584, "region": "Japanese", "recipe": "Spinach with Sesame Dressing (from Japanese Home Cooking)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71731", "like": -1}, {"umap_0": 11.346846580505371, "umap_1": 6.1185808181762695, "region": "Scandinavian", "recipe": "Copy-Cat Hoito Finnish Pancakes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142221", "like": 0}, {"umap_0": 0.5280241370201111, "umap_1": 7.505107402801514, "region": "South American", "recipe": "Cornbread Beef Liver Patties", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/95957", "like": 0}, {"umap_0": 2.9683446884155273, "umap_1": 9.626852035522461, "region": "Thai", "recipe": "Thai-style Salmon Potato", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59548", "like": -1}, {"umap_0": -2.0645203590393066, "umap_1": 8.43727970123291, "region": "French", "recipe": "Salade Gruy\u00e8re De Comt\u00e8 Pour Deux", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/111594", "like": 1}, {"umap_0": 11.384322166442871, "umap_1": 7.0832085609436035, "region": "Australian", "recipe": "Malted Nightcap", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74725", "like": 0}, {"umap_0": 0.9112706184387207, "umap_1": 12.757550239562988, "region": "Indian Subcontinent", "recipe": "Tofu &amp; Potato Stuffed Bell Peppers in a Spicy Indian Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/65998", "like": 0}, {"umap_0": -0.4696824550628662, "umap_1": 12.515528678894043, "region": "Caribbean", "recipe": "Caribbean Jerk Seasoning &amp; Rub", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/90998", "like": 0}, {"umap_0": -2.077889919281006, "umap_1": 
8.809181213378906, "region": "Canadian", "recipe": "Stuffed Zucchini - Chilled", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142717", "like": 0}, {"umap_0": 2.6803553104400635, "umap_1": 7.0498199462890625, "region": "UK", "recipe": "Potato Skins and Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101388", "like": 0}, {"umap_0": 9.966638565063477, "umap_1": 9.367307662963867, "region": "Irish", "recipe": "Caribbean Ginger Beer", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135837", "like": 0}, {"umap_0": -1.4631023406982422, "umap_1": 6.109340190887451, "region": "Italian", "recipe": "Tomato Cream Pasta.", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/126644", "like": 1}, {"umap_0": 0.9024763703346252, "umap_1": 11.940484046936035, "region": "Indian Subcontinent", "recipe": "Takkali Payam Opma", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66263", "like": 0}, {"umap_0": 1.197152018547058, "umap_1": 7.201746463775635, "region": "South American", "recipe": "Oven (Or Slow Cooker) Barbecued Beef", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99428", "like": -1}, {"umap_0": 10.89644718170166, "umap_1": 6.839352607727051, "region": "Chinese and Mongolian", "recipe": "Chinese Chip Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53135", "like": 0}, {"umap_0": -0.5381951928138733, "umap_1": 5.4431586265563965, "region": "Rest Africa", "recipe": "Watermelon Salad With Celery-Nut Dressing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48565", "like": 1}, {"umap_0": 0.26075559854507446, "umap_1": 5.338015556335449, "region": "Canadian", "recipe": "Best 5 Minutes Sandwich", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147855", "like": -1}, {"umap_0": 10.12112045288086, "umap_1": 9.393574714660645, "region": "UK", "recipe": "Fluffy Orange Syllabub", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102368", "like": 0}, {"umap_0": 1.4043314456939697, "umap_1": 7.761448860168457, "region": "Greek", "recipe": "Avegolemono (Greek Lemon Chicken Soup)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/109512", "like": 0}, {"umap_0": 1.7731153964996338, "umap_1": 5.481025695800781, "region": "Mexican", "recipe": "Turkey Enchiladas---A Little Lighter Version", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/86341", "like": 1}, {"umap_0": 1.898542881011963, "umap_1": 11.419892311096191, "region": "Thai", "recipe": "Larb Gai - Thai Chicken Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58776", "like": 0}, {"umap_0": 10.133990287780762, "umap_1": 5.9880876541137695, "region": "French", "recipe": "Cinnamon Espresso Cream Sables", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114568", "like": 0}, {"umap_0": -0.7373630404472351, "umap_1": 7.718879222869873, "region": "Italian", "recipe": "Pear Parmesan Risotto With Hazelnuts", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120373", "like": 1}, {"umap_0": 8.82620906829834, "umap_1": 7.235421657562256, "region": "Deutschland", "recipe": "Mohnstriezel -- Poppy Seed Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138842", "like": 0}, {"umap_0": -0.813057005405426, "umap_1": 8.23116683959961, "region": "French", "recipe": "Grilled Chicken Dijonnaise", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/116372", "like": 0}, {"umap_0": 0.09704425930976868, "umap_1": 7.571213722229004, "region": "French", "recipe": "Lila's Favourite Chicken With Creamy Mushroom Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114057", "like": 1}, {"umap_0": 1.2712476253509521, "umap_1": 9.035685539245605, "region": "Irish", "recipe": "Glazed Corned Beef  (Silver Palate)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136167", 
"like": -1}, {"umap_0": 3.203937530517578, "umap_1": 9.78446102142334, "region": "US", "recipe": "Cedar Planked Salmon", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17247", "like": 0}, {"umap_0": -0.41177719831466675, "umap_1": 13.591712951660156, "region": "Australian", "recipe": "HOMEMADE Holiday  Seasoning and SPICE", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73001", "like": 0}, {"umap_0": 8.251080513000488, "umap_1": 6.563953876495361, "region": "Deutschland", "recipe": "Bavarian Pretzels- Pretzel Rolls (Laugenbrezeln)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137458", "like": 0}, {"umap_0": 0.8185259699821472, "umap_1": 8.256429672241211, "region": "South American", "recipe": "Onion-Smothered Beef", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99472", "like": 1}, {"umap_0": 1.339928388595581, "umap_1": 9.5243501663208, "region": "Eastern European", "recipe": "Beef &amp; Chickpea Paprikash", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105495", "like": -1}, {"umap_0": 10.011178016662598, "umap_1": 6.347383499145508, "region": "Italian", "recipe": "Italian Chocolate Chip Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/11230", "like": 0}, {"umap_0": 10.13790225982666, "umap_1": 8.951178550720215, "region": "French", "recipe": "Honey-Soaked Goat Cheese", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113255", "like": 1}, {"umap_0": 0.8377261161804199, "umap_1": 10.998503684997559, "region": "Chinese and Mongolian", "recipe": "Ginger Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/52851", "like": 0}, {"umap_0": 10.175458908081055, "umap_1": 7.586801052093506, "region": "Middle Eastern", "recipe": "Gluten Free Mandarin &amp; Almond Syrup Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/60470", "like": 0}, {"umap_0": 9.692200660705566, "umap_1": 8.407427787780762, "region": 
"Indian Subcontinent", "recipe": "Banana Burfi", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67407", "like": 0}, {"umap_0": -2.0898938179016113, "umap_1": 9.530020713806152, "region": "Italian", "recipe": "Italian Barbecued Steak", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123012", "like": 1}, {"umap_0": -1.5693659782409668, "umap_1": 9.3914213180542, "region": "Deutschland", "recipe": "Bratwurst-Potato Skillet Dinner", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137369", "like": 1}, {"umap_0": -1.116334080696106, "umap_1": 9.287352561950684, "region": "UK", "recipe": "English Mustard", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102369", "like": 0}, {"umap_0": -0.8204387426376343, "umap_1": 10.038043022155762, "region": "Greek", "recipe": "Iman Bayaldi (Greek Eggplant Layers)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110275", "like": 1}, {"umap_0": 0.38013556599617004, "umap_1": 9.571761131286621, "region": "Mexican", "recipe": "Becky's Gourmet Southwestern Taco Salad or Nachos With Steak", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82723", "like": 1}, {"umap_0": 10.460600852966309, "umap_1": 5.786361217498779, "region": "Belgian", "recipe": "Oma's Boterkoek (Dutch Buttercake)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134046", "like": 0}, {"umap_0": 10.43691635131836, "umap_1": 9.140484809875488, "region": "Australian", "recipe": "Tangy Citrus Pots", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73504", "like": 0}, {"umap_0": -1.8791556358337402, "umap_1": 9.904389381408691, "region": "Rest Africa", "recipe": "Arabian Pickled Onions", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47878", "like": 0}, {"umap_0": 0.5187867283821106, "umap_1": 12.717525482177734, "region": "Indian Subcontinent", "recipe": "Kheema Matar", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/65666", "like": -1}, {"umap_0": 0.5754416584968567, "umap_1": 9.46481990814209, "region": "Caribbean", "recipe": "Fidel's Revolutionary Chili", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91717", "like": -1}, {"umap_0": 3.005664110183716, "umap_1": 10.086755752563477, "region": "Korean", "recipe": "Pasta with Korean Sesame Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69271", "like": 0}, {"umap_0": -2.39272403717041, "umap_1": 7.180198669433594, "region": "Greek", "recipe": "Ww 5 Points - Greek Pizza", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110304", "like": 1}, {"umap_0": 0.7749611139297485, "umap_1": 7.597709655761719, "region": "Canadian", "recipe": "Quebec Meat Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144798", "like": 0}, {"umap_0": -1.7026762962341309, "umap_1": 9.14134407043457, "region": "Italian", "recipe": "Warm Green Bean Salad with Pine Nuts and Basil", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/119950", "like": 1}, {"umap_0": -0.7000232338905334, "umap_1": 6.579843997955322, "region": "Italian", "recipe": "Risotto Alla Milanese", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125993", "like": 1}, {"umap_0": 7.717833042144775, "umap_1": 6.442302227020264, "region": "Eastern European", "recipe": "Grandma Reilkoff's Authentic Russian Blintze", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100378", "like": 0}, {"umap_0": 0.5022920966148376, "umap_1": 6.842819690704346, "region": "Eastern European", "recipe": "Polish Kapusta Kielbasa", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/133705", "like": 0}, {"umap_0": 1.6971757411956787, "umap_1": 11.688207626342773, "region": "Southeast Asian", "recipe": "Cambodian Fish Amok", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62694", "like": 0}, {"umap_0": 
-1.6330304145812988, "umap_1": 9.885078430175781, "region": "Rest Africa", "recipe": "Rice Salad With Mint", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48392", "like": 0}, {"umap_0": -0.5844018459320068, "umap_1": 11.074140548706055, "region": "Caribbean", "recipe": "Bahama Shrimp Skewers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91517", "like": 1}, {"umap_0": 9.613204002380371, "umap_1": 8.623011589050293, "region": "Middle Eastern", "recipe": "Lokum ( Turkish Delight )", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61228", "like": 0}, {"umap_0": -0.7855482697486877, "umap_1": 11.475994110107422, "region": "Spanish and Portuguese", "recipe": "Grilled Pork Threads", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141185", "like": 1}, {"umap_0": 10.437041282653809, "umap_1": 7.87003231048584, "region": "Australian", "recipe": "Nashi Pear Puds", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73076", "like": 0}, {"umap_0": -0.5804792642593384, "umap_1": 7.825493335723877, "region": "Italian", "recipe": "Veal Scaloppine Saltimbocca", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130084", "like": 1}, {"umap_0": 0.35925719141960144, "umap_1": 8.358013153076172, "region": "Irish", "recipe": "Irish Stew With Guinness", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134570", "like": 1}, {"umap_0": 0.021180037409067154, "umap_1": 10.130359649658203, "region": "Mexican", "recipe": "Breakfast Egg and Avocado Tostada", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84551", "like": 1}, {"umap_0": 0.27298417687416077, "umap_1": 8.719152450561523, "region": "Canadian", "recipe": "Pork Chops in Mushroom Soup (Aka Pork Chop Dinner)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145507", "like": -1}, {"umap_0": 0.6827617287635803, "umap_1": 9.6076078414917, "region": "Mexican", "recipe": "Chili Con Carne With Veggies", 
"url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79866", "like": 1}, {"umap_0": 2.017608880996704, "umap_1": 5.731778621673584, "region": "Mexican", "recipe": "Layered Mexican Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/88539", "like": 1}, {"umap_0": 1.4073379039764404, "umap_1": 7.631011486053467, "region": "Scandinavian", "recipe": "Microwave Swedish Meatballs for 2", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/106383", "like": -1}, {"umap_0": -0.4583122432231903, "umap_1": 8.868474960327148, "region": "UK", "recipe": "English Spiced Beef", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/103982", "like": -1}, {"umap_0": -1.6112645864486694, "umap_1": 12.072936058044434, "region": "Mexican", "recipe": "Tostitos Rica Mango Salsa", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/81432", "like": 0}, {"umap_0": 11.519378662109375, "umap_1": 7.874290466308594, "region": "French", "recipe": "Blintz Stuffed French Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112924", "like": 1}, {"umap_0": 0.22269786894321442, "umap_1": 6.97373104095459, "region": "Canadian", "recipe": "Cauliflower With Mushroom Cheese Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142947", "like": 1}, {"umap_0": 9.784527778625488, "umap_1": 6.597893238067627, "region": "UK", "recipe": "Chocolate Surprise Cupcakes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/8946", "like": 1}, {"umap_0": 9.102849960327148, "umap_1": 6.40867280960083, "region": "Canadian", "recipe": "Poppy Seed Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144384", "like": 0}, {"umap_0": -2.1560158729553223, "umap_1": 9.51371955871582, "region": "Canadian", "recipe": "Mustard Chive Vinaigrette", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143588", "like": 1}, {"umap_0": 1.1569068431854248, "umap_1": 9.489564895629883, 
"region": "Italian", "recipe": "Italian Sausage Sweet #2", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/121288", "like": 0}, {"umap_0": 10.355172157287598, "umap_1": 8.300712585449219, "region": "Canadian", "recipe": "Maple Candied Sweet Potatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142712", "like": 0}, {"umap_0": 10.1763334274292, "umap_1": 5.847024917602539, "region": "Canadian", "recipe": "Maple Squares", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143156", "like": 0}, {"umap_0": 0.36425596475601196, "umap_1": 9.473384857177734, "region": "Spanish and Portuguese", "recipe": "Skillet Linguica With Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/118791", "like": 1}, {"umap_0": 7.525515556335449, "umap_1": 7.143021583557129, "region": "Italian", "recipe": "Homemade Pasta Sheets", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130426", "like": 1}, {"umap_0": 1.1162711381912231, "umap_1": 8.146279335021973, "region": "South American", "recipe": "Vegetable Beef Soup (Crockpot)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/94531", "like": -1}, {"umap_0": -1.3993396759033203, "umap_1": 5.5542683601379395, "region": "Canadian", "recipe": "Spinach Chicken Breast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145842", "like": 1}, {"umap_0": 1.387834906578064, "umap_1": 10.809419631958008, "region": "Scandinavian", "recipe": "Not so Universal Dipping Sauce (Iceland)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/139836", "like": 0}, {"umap_0": -1.0863066911697388, "umap_1": 4.289551734924316, "region": "Italian", "recipe": "Pesto Snack Pizza", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122403", "like": 1}, {"umap_0": 10.452999114990234, "umap_1": 7.272473335266113, "region": "Scandinavian", "recipe": "Scandinavian Rice Pudding", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/139310", "like": 0}, {"umap_0": -0.5004931688308716, "umap_1": 7.338395118713379, "region": "Eastern European", "recipe": "Catfish in Sauerkraut", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105709", "like": 0}, {"umap_0": 10.01169490814209, "umap_1": 7.774265766143799, "region": "Indian Subcontinent", "recipe": "Baklava - An Indian Milk Sweet In Sugar Syrup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66702", "like": 0}, {"umap_0": 0.15904690325260162, "umap_1": 8.337121963500977, "region": "South American", "recipe": "Tender Braised Beef Short Ribs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98623", "like": -1}, {"umap_0": -0.9758134484291077, "umap_1": 11.048715591430664, "region": "Middle Eastern", "recipe": "Turkish Marinade (Beef &amp; Lamb Kabobs)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70190", "like": 1}, {"umap_0": -1.8655471801757812, "umap_1": 7.9572038650512695, "region": "Indian Subcontinent", "recipe": "Romaine Salad with Grilled Vegetables", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66478", "like": 1}, {"umap_0": 3.9817955493927, "umap_1": 9.785297393798828, "region": "Japanese", "recipe": "Inari Zushi", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70458", "like": 0}, {"umap_0": 9.90712833404541, "umap_1": 7.746610641479492, "region": "Northern Africa", "recipe": "Bourtaka Muhallabieh (Orange Custard With Caramel)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/50980", "like": 0}, {"umap_0": 0.7801458835601807, "umap_1": 11.89452075958252, "region": "Indian Subcontinent", "recipe": "Cottage Cheese (paneer) Cutlets", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67039", "like": 1}, {"umap_0": 0.9591661691665649, "umap_1": 10.184574127197266, "region": "Spanish and Portuguese", "recipe": "Battered Grouper Bites With Jalapeno Tartar 
Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140789", "like": 0}, {"umap_0": 0.8904129862785339, "umap_1": 12.776399612426758, "region": "Indian Subcontinent", "recipe": "Easy Spicy Curried Peas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/65218", "like": 0}, {"umap_0": -0.12338791787624359, "umap_1": 8.361936569213867, "region": "Eastern European", "recipe": "Lecho", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105764", "like": -1}, {"umap_0": 0.033274564892053604, "umap_1": 8.684553146362305, "region": "Central American", "recipe": "Martha Stewart's All American Meatloaf", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92963", "like": 1}, {"umap_0": 0.04600761458277702, "umap_1": 9.539316177368164, "region": "US", "recipe": "Cajun Roasted Pork Loin", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/15695", "like": 1}, {"umap_0": -1.588865041732788, "umap_1": 9.696952819824219, "region": "French", "recipe": "Ratatouille from the Ritz", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/111645", "like": 1}, {"umap_0": -1.7829601764678955, "umap_1": 7.490671634674072, "region": "Italian", "recipe": "Healthy Eggplant Parmesan (No Frying Required)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12364", "like": 1}, {"umap_0": 0.31522831320762634, "umap_1": 5.601083278656006, "region": "Southeast Asian", "recipe": "Embutido", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3943", "like": 1}, {"umap_0": 10.354706764221191, "umap_1": 8.461705207824707, "region": "Spanish and Portuguese", "recipe": "Fig Jam Sweet Spanish Temptation!", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141665", "like": 0}, {"umap_0": -0.5451114773750305, "umap_1": 5.423852443695068, "region": "Italian", "recipe": "Olive Garden Baked Pasta Romana", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130632", "like": 1}, 
{"umap_0": 1.9135979413986206, "umap_1": 5.672440528869629, "region": "South American", "recipe": "Beef and Kasha Mexicana", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98509", "like": 1}, {"umap_0": -1.2617789506912231, "umap_1": 6.856207847595215, "region": "UK", "recipe": "Roasted Stuffed Onions - the Naked Chef, Jamie Oliver", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/103997", "like": 1}, {"umap_0": 0.22626058757305145, "umap_1": 7.87052059173584, "region": "Italian", "recipe": "Chicken Breast With Lemon Sage Butter", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130529", "like": 0}, {"umap_0": 1.2216589450836182, "umap_1": 9.630709648132324, "region": "French", "recipe": "Crock Pot -Slow Cooker French Canadian Beans", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112915", "like": 0}, {"umap_0": -0.8544538021087646, "umap_1": 8.88943099975586, "region": "Eastern European", "recipe": "Walleye Hungarian Style (S\u00fcll\u00f6szelet Magyarosan)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105622", "like": 0}, {"umap_0": 3.954008102416992, "umap_1": 9.858881950378418, "region": "Japanese", "recipe": "Sweet Potatoes Simmered With Hijiki", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/71093", "like": 0}, {"umap_0": 1.4343724250793457, "umap_1": 7.680675983428955, "region": "Scandinavian", "recipe": "Swedish Spareribs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/106075", "like": -1}, {"umap_0": 0.7790343165397644, "umap_1": 9.638144493103027, "region": "US", "recipe": "Texas New Mexico Chili", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/18626", "like": -1}, {"umap_0": 1.2660120725631714, "umap_1": 10.12723445892334, "region": "US", "recipe": "Southern-Style Buttermilk Fried Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16229", "like": 0}, {"umap_0": 1.881528377532959, "umap_1": 
7.698709011077881, "region": "Canadian", "recipe": "Porcupine Balls", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144062", "like": -1}, {"umap_0": -0.9288707375526428, "umap_1": 11.16751480102539, "region": "Spanish and Portuguese", "recipe": "Sofrito sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141155", "like": 1}, {"umap_0": -1.4165360927581787, "umap_1": 6.273860454559326, "region": "Italian", "recipe": "Sinatra's Meatball Sandwich", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125423", "like": 1}, {"umap_0": 0.9488803148269653, "umap_1": 11.570879936218262, "region": "Indian Subcontinent", "recipe": "Biriyani", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4127", "like": 1}, {"umap_0": 8.909699440002441, "umap_1": 5.620061874389648, "region": "Deutschland", "recipe": "How to Make Quark Cheese", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137255", "like": 0}, {"umap_0": 0.017823049798607826, "umap_1": 7.319319725036621, "region": "Middle Eastern", "recipe": "Lachmajou (Middle-Eastern Lamb Dish)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61611", "like": 1}, {"umap_0": 8.319412231445312, "umap_1": 7.170512676239014, "region": "French", "recipe": "Bread Machine Sesame French Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/116408", "like": 0}, {"umap_0": -1.2656375169754028, "umap_1": 10.194904327392578, "region": "Italian", "recipe": "Carrot Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124794", "like": 0}, {"umap_0": 0.015035233460366726, "umap_1": 10.506610870361328, "region": "Middle Eastern", "recipe": "Stuffed Eggplants With Meat (Karniyarik)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70078", "like": -1}, {"umap_0": -1.7405555248260498, "umap_1": 6.960663318634033, "region": "Italian", "recipe": "Sausage &amp; Peppers Etc: 2 Quick  Meals", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127565", "like": 1}, {"umap_0": 9.68491268157959, "umap_1": 7.133563041687012, "region": "Mexican", "recipe": "Double Chocolate Cupcakes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/87963", "like": 0}, {"umap_0": 0.7470725178718567, "umap_1": 7.653170585632324, "region": "South American", "recipe": "Magyar Beef Paprikash", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/94614", "like": -1}, {"umap_0": 9.491979598999023, "umap_1": 8.344870567321777, "region": "UK", "recipe": "Coconut Ice - No Cook Easy Version", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/103620", "like": 0}, {"umap_0": 10.71562671661377, "umap_1": 7.951128005981445, "region": "US", "recipe": "Maple Syrup Tarts", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16410", "like": 0}, {"umap_0": 1.3340120315551758, "umap_1": 12.653389930725098, "region": "Indian Subcontinent", "recipe": "Chachchari ( Mixed Vegetables - Bengali)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66098", "like": -1}, {"umap_0": 0.12013180553913116, "umap_1": 6.423801422119141, "region": "Indian Subcontinent", "recipe": "Hunza Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59615", "like": 1}, {"umap_0": -1.8127301931381226, "umap_1": 9.250195503234863, "region": "Rest Africa", "recipe": "Moroccan Style Chickpea Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48159", "like": 1}, {"umap_0": -2.540347099304199, "umap_1": 8.002099990844727, "region": "Greek", "recipe": "Greek Inspired London Broil", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110617", "like": 1}, {"umap_0": -1.625109076499939, "umap_1": 5.669013500213623, "region": "Italian", "recipe": "Contadina\u00ae Italian Chili", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12005", "like": 1}, {"umap_0": 2.9198195934295654, "umap_1": 
9.796874046325684, "region": "Chinese and Mongolian", "recipe": "Light Chinese Chicken Salad With Hoisin Vinaigrette", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53172", "like": 0}, {"umap_0": 0.1132591962814331, "umap_1": 5.34034538269043, "region": "French", "recipe": "Camembert Parcel With Rhubarb and Cranberry Coulis", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/111655", "like": 1}, {"umap_0": 11.730826377868652, "umap_1": 7.70757532119751, "region": "Italian", "recipe": "A New Yorker's Real Italian Cheesecake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120660", "like": 1}, {"umap_0": -0.03750184178352356, "umap_1": 4.46916389465332, "region": "US", "recipe": "Bethany's Frito\u00ae Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17324", "like": 1}, {"umap_0": -1.0414122343063354, "umap_1": 5.6872148513793945, "region": "US", "recipe": "Penne with Italian Sausage, Mushrooms, and Pumpkin Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/18734", "like": 1}, {"umap_0": -1.836037039756775, "umap_1": 8.983660697937012, "region": "Italian", "recipe": "Eggplant Relish", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132233", "like": 1}, {"umap_0": 0.8373492360115051, "umap_1": 11.240327835083008, "region": "Mexican", "recipe": "Mexican Enchilada Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83038", "like": 0}, {"umap_0": 10.626126289367676, "umap_1": 10.718032836914062, "region": "Mexican", "recipe": "Prickly Pear Margaritas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82898", "like": 0}, {"umap_0": -2.404245376586914, "umap_1": 8.472664833068848, "region": "Italian", "recipe": "Prosciutto Wrapped Grilled Prawns", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132078", "like": 1}, {"umap_0": -0.475350558757782, "umap_1": 11.476449966430664, "region": "Middle Eastern", "recipe": 
"Orange Chicken Koresh Iranian", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61911", "like": 0}, {"umap_0": 1.1926896572113037, "umap_1": 8.951848983764648, "region": "South American", "recipe": "Beef Curry With Apples and Raisins", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/98853", "like": -1}, {"umap_0": -2.483853578567505, "umap_1": 7.934386730194092, "region": "Italian", "recipe": "Penne Tricolore", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120255", "like": 1}, {"umap_0": 0.7254524827003479, "umap_1": 8.167946815490723, "region": "Eastern European", "recipe": "Hungarian Meatball Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105639", "like": 1}, {"umap_0": 10.114264488220215, "umap_1": 9.311877250671387, "region": "Spanish and Portuguese", "recipe": "Compota De Peras (Pear Compote)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/140332", "like": 0}, {"umap_0": -0.5089557766914368, "umap_1": 9.376617431640625, "region": "Caribbean", "recipe": "Jamaican Rice and Peas (Baked)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91614", "like": 0}, {"umap_0": 1.7959222793579102, "umap_1": 10.138152122497559, "region": "South American", "recipe": "My Beef Marinade and Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/94151", "like": -1}, {"umap_0": 1.5806632041931152, "umap_1": 11.400876998901367, "region": "Chinese and Mongolian", "recipe": "Elegant Shrimp on Chinese Spoons", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/52172", "like": 0}, {"umap_0": 10.686570167541504, "umap_1": 10.334814071655273, "region": "Irish", "recipe": "Irish Spring", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134755", "like": 0}, {"umap_0": 9.512006759643555, "umap_1": 6.218499183654785, "region": "Irish", "recipe": "Traditional Irish Brown Soda Bread", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134961", "like": 0}, {"umap_0": -1.5781458616256714, "umap_1": 7.750732898712158, "region": "Italian", "recipe": "Barbecued Italian Sausage and Bean Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/130397", "like": 1}, {"umap_0": 0.4757559597492218, "umap_1": 12.604691505432129, "region": "Indian Subcontinent", "recipe": "Aloo Gobi", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64579", "like": 0}, {"umap_0": -0.22424258291721344, "umap_1": 11.056602478027344, "region": "Canadian", "recipe": "Canadian Day Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/144191", "like": 0}, {"umap_0": -0.17222923040390015, "umap_1": 9.497633934020996, "region": "Caribbean", "recipe": "Colorful Caribbean Sweet Potato Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/7986", "like": 0}, {"umap_0": -0.24111458659172058, "umap_1": 10.542080879211426, "region": "Rest Africa", "recipe": "Moroccan Chickpea Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48060", "like": 1}, {"umap_0": 11.142991065979004, "umap_1": 8.3550443649292, "region": "Mexican", "recipe": "Mexican Trifle", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84027", "like": 0}, {"umap_0": 9.90172004699707, "umap_1": 6.256960391998291, "region": "Canadian", "recipe": "Date Cake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146324", "like": 0}, {"umap_0": 9.844709396362305, "umap_1": 9.310951232910156, "region": "Deutschland", "recipe": "Hippocras (Non-Alcoholic)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138492", "like": 0}, {"umap_0": 1.7144811153411865, "umap_1": 7.089815139770508, "region": "South American", "recipe": "Ground Beef &amp; Vegetable Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/94449", "like": -1}, {"umap_0": 10.07182502746582, "umap_1": 8.341670036315918, 
"region": "Caribbean", "recipe": "Coquito (Puerto Rican Eggnog)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92177", "like": 0}, {"umap_0": 0.5005300045013428, "umap_1": 12.809632301330566, "region": "Indian Subcontinent", "recipe": "Butternut Squash Coconut Curry", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4304", "like": 1}, {"umap_0": -0.4200623333454132, "umap_1": 5.337055206298828, "region": "Canadian", "recipe": "Seafood Stuffed Mushrooms", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148600", "like": 1}, {"umap_0": 11.81010913848877, "umap_1": 7.760477542877197, "region": "UK", "recipe": "Almond Roca (English Toffee)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102568", "like": 0}, {"umap_0": 9.832840919494629, "umap_1": 7.890415668487549, "region": "Australian", "recipe": "Pikelets With Cinnamon Apple Slices", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73346", "like": 0}, {"umap_0": -1.7133979797363281, "umap_1": 6.6384172439575195, "region": "French", "recipe": "Green Bean Salad With Radishes and Prosciutto", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114693", "like": 1}, {"umap_0": 0.749729335308075, "umap_1": 5.082089900970459, "region": "French", "recipe": "French Onion Potato Casserole #5FIX", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112868", "like": 1}, {"umap_0": -1.7115132808685303, "umap_1": 9.044116973876953, "region": "Australian", "recipe": "Bubble &amp; Squeak Patties With Tomato Salsa", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75956", "like": 1}, {"umap_0": 1.373085856437683, "umap_1": 9.704680442810059, "region": "Rest Africa", "recipe": "West African Peanut Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48277", "like": 0}, {"umap_0": 1.4935951232910156, "umap_1": 9.735342979431152, "region": "Chinese and Mongolian", "recipe": "Chinese Plum Chicken 
Cutlets", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55948", "like": 0}, {"umap_0": -1.5976577997207642, "umap_1": 11.886408805847168, "region": "Mexican", "recipe": "Traditional Guacamole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84631", "like": 1}, {"umap_0": -1.9662648439407349, "umap_1": 8.760674476623535, "region": "Canadian", "recipe": "Braised Cannellini Beans With Onions and Arugula", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/146341", "like": 1}, {"umap_0": -0.6250852942466736, "umap_1": 6.701230525970459, "region": "French", "recipe": "Vincent Price Soupe \u00e0 L'oignon - Onion Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112895", "like": 1}, {"umap_0": -1.2206172943115234, "umap_1": 6.755017280578613, "region": "Italian", "recipe": "Beans, Greens, and Sausage Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131547", "like": 1}, {"umap_0": 1.0188153982162476, "umap_1": 5.508022785186768, "region": "Mexican", "recipe": "Mexi-Corn Lasagna by Dei Fratelli", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83021", "like": 1}, {"umap_0": -1.432366967201233, "umap_1": 9.989920616149902, "region": "Australian", "recipe": "My French Dressing\t2", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/75266", "like": 1}, {"umap_0": 0.4453372061252594, "umap_1": 9.032065391540527, "region": "US", "recipe": "Sausage and Shrimp Gumbo", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/14548", "like": 0}, {"umap_0": -1.2803369760513306, "umap_1": 10.824132919311523, "region": "South American", "recipe": "Citrus Chilean Sea Bass W- Pineapple Mango Salsa", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93749", "like": 1}, {"umap_0": -2.219860792160034, "umap_1": 8.985672950744629, "region": "Spanish and Portuguese", "recipe": "Ensaladilla Tasca", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/141658", "like": 1}, {"umap_0": 10.401158332824707, "umap_1": 8.364279747009277, "region": "Indian Subcontinent", "recipe": "Shahi Tukre With Peaches and Melba", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67778", "like": 0}, {"umap_0": 3.145146369934082, "umap_1": 10.443859100341797, "region": "Korean", "recipe": "Bulgogi \"fire Meat\" With Leafy Green Vegetables and Sw", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69603", "like": -1}, {"umap_0": 0.1614750176668167, "umap_1": 12.734733581542969, "region": "Australian", "recipe": "Cauliflower With Sesame Seeds", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77711", "like": 0}, {"umap_0": -0.07009800523519516, "umap_1": 4.914054870605469, "region": "Scandinavian", "recipe": "Cream Cheese and Salmon Smorrebrod", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/139892", "like": 1}, {"umap_0": 10.870918273925781, "umap_1": 7.694238185882568, "region": "Deutschland", "recipe": "Frankfurt-Style Chocolate Pudding (Frankfurter Pudding)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/137396", "like": 0}, {"umap_0": 8.278300285339355, "umap_1": 7.1392903327941895, "region": "French", "recipe": "French Health Rolls", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115897", "like": 0}, {"umap_0": 8.452439308166504, "umap_1": 7.092415809631348, "region": "Northern Africa", "recipe": "Moroccan Semolina Soup with Milk, Anise Seeds, and Honey", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/2696", "like": 0}, {"umap_0": -1.2895585298538208, "umap_1": 11.647132873535156, "region": "Mexican", "recipe": "Guacamole: a Basic Recipe", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80240", "like": 1}, {"umap_0": 3.1594626903533936, "umap_1": 9.679027557373047, "region": "Chinese and Mongolian", "recipe": "Steamed Chicken", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55509", "like": 0}, {"umap_0": 9.92365550994873, "umap_1": 6.91334867477417, "region": "UK", "recipe": "Damp Jamaican Gingerbread (English Recipe)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102531", "like": 0}, {"umap_0": 0.14099453389644623, "umap_1": 11.822744369506836, "region": "Indian Subcontinent", "recipe": "Baked Samosas (From the New Revised Edition Moosewood Cookbook)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64126", "like": 0}, {"umap_0": 10.13015079498291, "umap_1": 9.819299697875977, "region": "Australian", "recipe": "Lemon, Lime and Bitters Crush", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77203", "like": 0}, {"umap_0": 0.7510555386543274, "umap_1": 13.056574821472168, "region": "Canadian", "recipe": "Montreal Steak Spice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147612", "like": 0}, {"umap_0": 0.2103879153728485, "umap_1": 7.211034297943115, "region": "Belgian", "recipe": "Bloemkool Met Versche Wurst - Cauliflower With Sausage", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134251", "like": 0}, {"umap_0": -1.9002048969268799, "umap_1": 6.193087100982666, "region": "South American", "recipe": "One-Pot Macaroni With Beef and Tomatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/99967", "like": 1}, {"umap_0": -2.3340203762054443, "umap_1": 7.4687018394470215, "region": "Italian", "recipe": "Spaghetti with Roasted Tomatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125098", "like": 1}, {"umap_0": 0.5050579309463501, "umap_1": 9.32116413116455, "region": "US", "recipe": "Super Easy Sloppy Joes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/14574", "like": -1}, {"umap_0": 2.9883620738983154, "umap_1": 9.80622386932373, "region": "Chinese and Mongolian", "recipe": "Salmon with Ginger and Black Bean", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56437", "like": 0}, {"umap_0": 12.078207969665527, "umap_1": 4.6823248863220215, "region": "South American", "recipe": "Brazilian Cheese Puffs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93321", "like": 1}, {"umap_0": 0.8912193179130554, "umap_1": 7.144341468811035, "region": "Irish", "recipe": "Chicken Pot Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135735", "like": -1}, {"umap_0": -2.593534231185913, "umap_1": 7.8842878341674805, "region": "South American", "recipe": "Colorful Heirloom Bruschetta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100045", "like": 1}, {"umap_0": 10.912936210632324, "umap_1": 7.592945098876953, "region": "Eastern European", "recipe": "Kolache Raisin Cottage Cheese Filling", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142394", "like": 1}, {"umap_0": -0.21122437715530396, "umap_1": 7.659480571746826, "region": "French", "recipe": "Crock Pot Chicken With Mushrooms and Leeks (Low Carb)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/117175", "like": -1}, {"umap_0": -2.0493929386138916, "umap_1": 7.6341552734375, "region": "Italian", "recipe": "Goat Cheese and Artichoke Pizza", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/127680", "like": 1}, {"umap_0": -2.528930187225342, "umap_1": 7.982789039611816, "region": "Canadian", "recipe": "Broccoli and Feta Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147990", "like": 1}, {"umap_0": 11.443806648254395, "umap_1": 8.00767707824707, "region": "Scandinavian", "recipe": "Super Easy Fruit Danish", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101226", "like": 1}, {"umap_0": -1.2563352584838867, "umap_1": 10.434575080871582, "region": "Northern Africa", "recipe": "Moroccan Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/50376", "like": 1}, {"umap_0": 
-0.5071907043457031, "umap_1": 6.496397972106934, "region": "Canadian", "recipe": "Mediterranean Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/148402", "like": 1}, {"umap_0": 0.06120110675692558, "umap_1": 7.997251987457275, "region": "Japanese", "recipe": "Bacon Mushroom Burger", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70786", "like": -1}, {"umap_0": 0.6496170163154602, "umap_1": 9.024492263793945, "region": "Scandinavian", "recipe": "Danish Fruit Stuffed Pork Roast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101204", "like": 0}, {"umap_0": 1.8443557024002075, "umap_1": 10.97974681854248, "region": "Thai", "recipe": "Fiery Thai Beef Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59045", "like": 0}, {"umap_0": -1.7164322137832642, "umap_1": 8.89891529083252, "region": "Italian", "recipe": "Italian Marinated Vegetables", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/120351", "like": 1}, {"umap_0": 1.1932616233825684, "umap_1": 12.024711608886719, "region": "Indian Subcontinent", "recipe": "Tandoori Grilled Chicken", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4080", "like": 0}, {"umap_0": 10.764182090759277, "umap_1": 10.989800453186035, "region": "Deutschland", "recipe": "Spezi for Grown Ups", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/107433", "like": 0}, {"umap_0": 1.3912837505340576, "umap_1": 5.971057415008545, "region": "Mexican", "recipe": "Make-Ahead Mexican Lasagna", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/84255", "like": 1}, {"umap_0": 12.073341369628906, "umap_1": 8.225969314575195, "region": "Canadian", "recipe": "Peanut Butter Crunchers 2007", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145033", "like": 0}, {"umap_0": -1.5111716985702515, "umap_1": 9.223572731018066, "region": "Italian", "recipe": "Sicilian Tuna Steaks in Onion Sauce", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122877", "like": 1}, {"umap_0": 2.3548498153686523, "umap_1": 10.851354598999023, "region": "Central American", "recipe": "Honduran Coconut Seafood Soup (Sopa Catratcha De Mariscos Con Un", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/92710", "like": 1}, {"umap_0": 1.2219387292861938, "umap_1": 12.110342025756836, "region": "Indian Subcontinent", "recipe": "Sweet Corn Biryani !", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64376", "like": 0}, {"umap_0": -1.9334834814071655, "umap_1": 6.069193363189697, "region": "Italian", "recipe": "Italian Pasta Veggie Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/10386", "like": -1}, {"umap_0": 2.841202974319458, "umap_1": 10.765084266662598, "region": "Southeast Asian", "recipe": "Chili Hot Sauce Recipe - Sriracha Style", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62072", "like": 0}, {"umap_0": 3.505317449569702, "umap_1": 9.640971183776855, "region": "Chinese and Mongolian", "recipe": "Easy Fried Rice for Two", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53288", "like": 0}, {"umap_0": 8.23872184753418, "umap_1": 6.855097770690918, "region": "Indian Subcontinent", "recipe": "Indian Naan II", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4617", "like": 0}, {"umap_0": -1.2070578336715698, "umap_1": 11.195632934570312, "region": "Mexican", "recipe": "The Fires of Hades - Hellish Relish", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/7593", "like": 1}, {"umap_0": -1.3469449281692505, "umap_1": 8.995787620544434, "region": "Middle Eastern", "recipe": "Ghormeh Sabzi (Persian Herb Stew)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/2872", "like": -1}, {"umap_0": 11.741819381713867, "umap_1": 8.536918640136719, "region": "Irish", "recipe": "Paris Peppermint Topping", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134908", "like": 0}, {"umap_0": 3.7550482749938965, "umap_1": 10.522557258605957, "region": "Japanese", "recipe": "Japanese Potsticker Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/5001", "like": 0}, {"umap_0": -2.072842597961426, "umap_1": 8.675630569458008, "region": "Greek", "recipe": "Fat-Free Tzatziki", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/108172", "like": 1}, {"umap_0": 1.7465989589691162, "umap_1": 10.382780075073242, "region": "Southeast Asian", "recipe": "Filipino Beef Kabobs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/63719", "like": 0}, {"umap_0": -1.587270975112915, "umap_1": 11.241811752319336, "region": "Middle Eastern", "recipe": "Quick &amp; Easy Tahini Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62823", "like": 0}, {"umap_0": 1.2939603328704834, "umap_1": 8.155804634094238, "region": "Canadian", "recipe": "Traditional Christmas Meat Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145128", "like": -1}, {"umap_0": 0.850754976272583, "umap_1": 9.974328994750977, "region": "Australian", "recipe": "Baby Back Ribs (Slow Cooker)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73702", "like": 0}, {"umap_0": 10.84080696105957, "umap_1": 6.465611934661865, "region": "Mexican", "recipe": "Chocolate Allspice Dessert Nachos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6289", "like": 0}, {"umap_0": 1.6355795860290527, "umap_1": 9.453885078430176, "region": "Thai", "recipe": "Chili-Mushroom Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58687", "like": 1}, {"umap_0": 11.0742769241333, "umap_1": 10.460848808288574, "region": "US", "recipe": "Vieux Carre", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16926", "like": 0}, {"umap_0": -0.011873293668031693, "umap_1": 5.467948913574219, "region": "South 
American", "recipe": "Cottage Cheesy Beef Bake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/97014", "like": 1}, {"umap_0": 0.707388162612915, "umap_1": 12.20598316192627, "region": "Indian Subcontinent", "recipe": "Slow Cooker Chicken Tikka Masala - Weight Watchers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/65216", "like": 0}, {"umap_0": -2.0839784145355225, "umap_1": 9.767499923706055, "region": "Canadian", "recipe": "Grilled Lemon-Basil Halibut", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142844", "like": 1}, {"umap_0": -1.3928797245025635, "umap_1": 7.954107761383057, "region": "Italian", "recipe": "Baked Orecchiette With Pork Sugo", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/128724", "like": 1}, {"umap_0": 0.37006673216819763, "umap_1": 11.647196769714355, "region": "Northern Africa", "recipe": "Fish Couscous W-Onion T'faya (ZWT-9 \u2013 Morocco)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/50404", "like": 1}, {"umap_0": -1.4682801961898804, "umap_1": 10.121813774108887, "region": "Mexican", "recipe": "Grilled Portabella Mushrooms and Veggie Fajitas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80416", "like": 1}, {"umap_0": -1.665097713470459, "umap_1": 9.799483299255371, "region": "Middle Eastern", "recipe": "Marinated Mushrooms", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/60612", "like": -1}, {"umap_0": 7.448425769805908, "umap_1": 6.933340072631836, "region": "Australian", "recipe": "Wattleseed Pasta", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77472", "like": 1}, {"umap_0": 3.251059055328369, "umap_1": 9.023836135864258, "region": "Chinese and Mongolian", "recipe": "Easy Chicken Stir-Fry With Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/53890", "like": 0}, {"umap_0": 2.5244975090026855, "umap_1": 11.083739280700684, "region": "Southeast Asian", "recipe": "Java 
Style Beef Sate", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3443", "like": -1}, {"umap_0": -0.5707054734230042, "umap_1": 7.356980800628662, "region": "UK", "recipe": "Poached Lobster, Scallops and Vegetables with Vermouth", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/102216", "like": 0}, {"umap_0": 0.9268041849136353, "umap_1": 5.902054309844971, "region": "Mexican", "recipe": "South of the Border Steak", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/80384", "like": 1}, {"umap_0": 0.9334036707878113, "umap_1": 8.925015449523926, "region": "Mexican", "recipe": "My Mother's Delicious Chicken Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/89480", "like": -1}, {"umap_0": -1.6990082263946533, "umap_1": 7.760260105133057, "region": "Italian", "recipe": "Spicy Seafood and Penne", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124721", "like": 1}, {"umap_0": -2.514467477798462, "umap_1": 7.685223579406738, "region": "Italian", "recipe": "Tomato Basil Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129930", "like": 1}, {"umap_0": 3.0204155445098877, "umap_1": 9.322197914123535, "region": "Canadian", "recipe": "Sweet &amp; Sour Pork Crepes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147652", "like": 0}, {"umap_0": 10.36502742767334, "umap_1": 6.631762504577637, "region": "Canadian", "recipe": "Sassy's Awesome Banana Bread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145548", "like": 0}, {"umap_0": 11.079668045043945, "umap_1": 8.859467506408691, "region": "US", "recipe": "Billy's Fro-Yo Doggy Treats", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17984", "like": 0}, {"umap_0": 0.6927990317344666, "umap_1": 6.8058271408081055, "region": "Irish", "recipe": "Southern Colcannon", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/136192", "like": 0}, {"umap_0": -0.9374880790710449, 
"umap_1": 9.93418025970459, "region": "US", "recipe": "Bahamian Baked Grouper", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/18507", "like": 0}, {"umap_0": 0.48706337809562683, "umap_1": 7.97437858581543, "region": "Eastern European", "recipe": "Sauteed Cabbage With Pork", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/100844", "like": 0}, {"umap_0": 11.322461128234863, "umap_1": 8.1932373046875, "region": "Middle Eastern", "recipe": "Fruit &amp; Nut Chocolate Fudge", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/60581", "like": 0}, {"umap_0": 0.5671481490135193, "umap_1": 10.565378189086914, "region": "Northern Africa", "recipe": "Moroccan Peanut Couscous With Peas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/49699", "like": 0}, {"umap_0": 3.1680703163146973, "umap_1": 9.956693649291992, "region": "Chinese and Mongolian", "recipe": "Chinese Steamed Fish in Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/56838", "like": 0}, {"umap_0": -2.275972604751587, "umap_1": 6.432253837585449, "region": "Italian", "recipe": "Easy Bruschetta Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/132308", "like": 1}, {"umap_0": 10.261940956115723, "umap_1": 7.815501689910889, "region": "Australian", "recipe": "Pavlova With a Passionfruit Curd", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/73817", "like": 0}, {"umap_0": 0.1922290325164795, "umap_1": 11.985663414001465, "region": "Australian", "recipe": "Ainsley's Barbecue Spice Rub", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/74235", "like": 0}, {"umap_0": 1.6508421897888184, "umap_1": 11.269798278808594, "region": "Thai", "recipe": "Thai Mango Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/57597", "like": 0}, {"umap_0": -1.129408597946167, "umap_1": 5.371294021606445, "region": "Italian", "recipe": "Four Cheese Fettuccini with Prosciutto and Artichoke 
Hearts", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129038", "like": 1}, {"umap_0": 3.825429677963257, "umap_1": 10.303936958312988, "region": "Korean", "recipe": "Korean Tofu &amp; Leek BBQ", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/69583", "like": 0}, {"umap_0": 0.7130230665206909, "umap_1": 9.434370994567871, "region": "Rest Africa", "recipe": "Riz Gras - Fat Rice (Burkina Faso)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/48626", "like": -1}, {"umap_0": -0.3255823254585266, "umap_1": 11.150347709655762, "region": "Caribbean", "recipe": "Grilled Jamaican Pork Tenderloin Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91688", "like": 1}, {"umap_0": 1.8708499670028687, "umap_1": 8.618474006652832, "region": "Eastern European", "recipe": "Sultszalonnas Krumplisalata (Hot Bacon Potato Salad)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105730", "like": 0}, {"umap_0": -1.586853265762329, "umap_1": 8.754724502563477, "region": "South American", "recipe": "Brazilian Christmas Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/8332", "like": 1}, {"umap_0": 2.263772964477539, "umap_1": 10.496352195739746, "region": "Southeast Asian", "recipe": "Pickled Daikon Radish and Carrot", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3736", "like": 0}, {"umap_0": 10.011534690856934, "umap_1": 7.102960109710693, "region": "UK", "recipe": "Lily Vanilli\u2019s Bacon and Fairtrade Banana Cupcakes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/103759", "like": 0}, {"umap_0": 1.1060556173324585, "umap_1": 12.75713062286377, "region": "Australian", "recipe": "Chilli King Prawns (Shrimp) Skewers With Pistachio Coriander Rub", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/77309", "like": 0}, {"umap_0": 1.9740351438522339, "umap_1": 10.085070610046387, "region": "Indian Subcontinent", "recipe": "Chicken in 
Garlic Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/72231", "like": 0}, {"umap_0": 9.495906829833984, "umap_1": 5.442904472351074, "region": "UK", "recipe": "Shortbread", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/101689", "like": 0}, {"umap_0": -1.821033239364624, "umap_1": 9.254093170166016, "region": "Italian", "recipe": "Drink from the Bowl Spaghetti Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/128890", "like": 1}, {"umap_0": 3.1372594833374023, "umap_1": 9.517512321472168, "region": "Chinese and Mongolian", "recipe": "Moo Goo Gai Pan", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/55899", "like": -1}, {"umap_0": -1.1487171649932861, "umap_1": 9.068509101867676, "region": "Thai", "recipe": "Thai Fish Cakes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/59527", "like": 0}, {"umap_0": 10.898775100708008, "umap_1": 6.431585311889648, "region": "Eastern European", "recipe": "Czech Chocolate Pecan Cookies", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/142513", "like": 0}, {"umap_0": 10.298615455627441, "umap_1": 10.19249153137207, "region": "Italian", "recipe": "Jefes Grand Granita", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/121594", "like": 0}, {"umap_0": 10.321759223937988, "umap_1": 7.565423965454102, "region": "Italian", "recipe": "Easiest Amaretti Cookies Ever", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/121047", "like": 0}, {"umap_0": 3.104684352874756, "umap_1": 10.503979682922363, "region": "Indian Subcontinent", "recipe": "Superb Satay Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/68310", "like": 0}, {"umap_0": 1.2906321287155151, "umap_1": 6.730407238006592, "region": "US", "recipe": "New Mexico Green Chile Breakfast Burritos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/18614", "like": 1}, {"umap_0": 10.791954040527344, "umap_1": 
8.142362594604492, "region": "Middle Eastern", "recipe": "Egyptian Palace Bread (Dessert)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/47482", "like": 0}, {"umap_0": -0.23186954855918884, "umap_1": 12.196702003479004, "region": "US", "recipe": "Southern Texas-Style Beef Barbacoa", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17442", "like": -1}, {"umap_0": 1.8301336765289307, "umap_1": 7.218275547027588, "region": "French", "recipe": "French Dip Burgers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/116491", "like": -1}, {"umap_0": -0.21403299272060394, "umap_1": 6.477816104888916, "region": "Italian", "recipe": "Healthy Kale Breakfast Frittata", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/122174", "like": 1}, {"umap_0": -0.9929534196853638, "umap_1": 5.950178623199463, "region": "Mexican", "recipe": "Dan's Chicken Enchiladas", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79395", "like": 1}, {"umap_0": 2.2955150604248047, "umap_1": 10.812288284301758, "region": "Southeast Asian", "recipe": "Vietnamese Grilled Catfish with Noodles (Cha Ca)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/62119", "like": 0}, {"umap_0": -2.085153579711914, "umap_1": 8.688108444213867, "region": "Italian", "recipe": "Neapolitan Seafood Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125781", "like": 1}, {"umap_0": 2.82283353805542, "umap_1": 10.811625480651855, "region": "Japanese", "recipe": "The Best Salmon Marinade", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/70475", "like": 1}, {"umap_0": 10.06797981262207, "umap_1": 8.157014846801758, "region": "Italian", "recipe": "Chocolate Torrone", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/121096", "like": 0}, {"umap_0": 2.264511823654175, "umap_1": 6.063222885131836, "region": "Mexican", "recipe": "Salsa Verde Pork Burrito Bowls", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/86036", "like": 0}, {"umap_0": -0.2860279977321625, "umap_1": 5.8170294761657715, "region": "US", "recipe": "Jim's Special Cajun Scallops", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16194", "like": 1}, {"umap_0": 0.17413212358951569, "umap_1": 10.479493141174316, "region": "Mexican", "recipe": "Poached Huevos Rancheros", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83725", "like": 1}, {"umap_0": 1.1284825801849365, "umap_1": 8.573590278625488, "region": "South American", "recipe": "Island Beef Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/96824", "like": -1}, {"umap_0": 0.02151194028556347, "umap_1": 12.289113998413086, "region": "Mexican", "recipe": "Adobo De Chile", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/79278", "like": 0}, {"umap_0": -1.7122970819473267, "umap_1": 9.194269180297852, "region": "Rest Africa", "recipe": "Ethiopian Potato Salad", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/51570", "like": 0}, {"umap_0": 10.568939208984375, "umap_1": 6.90299654006958, "region": "French", "recipe": "Gingerbread French Toast", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/115028", "like": 0}, {"umap_0": -0.23653945326805115, "umap_1": 6.094488620758057, "region": "Italian", "recipe": "Italian Dressing Croutons", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124750", "like": 0}, {"umap_0": 1.2102420330047607, "umap_1": 10.195693969726562, "region": "US", "recipe": "Cajun Baked French Fries", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/15675", "like": 1}, {"umap_0": -0.9183633923530579, "umap_1": 7.708144187927246, "region": "South American", "recipe": "Mexican Stuffed Beef Fillets", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/95756", "like": 1}, {"umap_0": 0.7877466082572937, "umap_1": 13.033647537231445, "region": "Indian 
Subcontinent", "recipe": "Mattar Paneer", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/65461", "like": 0}, {"umap_0": 10.983002662658691, "umap_1": 7.451921463012695, "region": "French", "recipe": "Frenchy Toast Casserole (French Toast Casserole) Ala the Cassero", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/113201", "like": 0}, {"umap_0": -1.8542791604995728, "umap_1": 5.826737403869629, "region": "Italian", "recipe": "Pasta With Lentils and Arugula (Martha Stewart)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/125029", "like": 1}, {"umap_0": -1.293904185295105, "umap_1": 7.929196357727051, "region": "Canadian", "recipe": "Mustard &amp; Herb Grilled Pork Chops", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145858", "like": 1}, {"umap_0": -0.5371036529541016, "umap_1": 7.5515828132629395, "region": "Middle Eastern", "recipe": "Rice Seasoning Mix", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61720", "like": 0}, {"umap_0": 1.5072715282440186, "umap_1": 9.702645301818848, "region": "US", "recipe": "Barbequed Country Ribs", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16254", "like": 0}, {"umap_0": 11.291780471801758, "umap_1": 6.740469932556152, "region": "Eastern European", "recipe": "Authentic Hungarian Pastries", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105459", "like": 1}, {"umap_0": -2.1887266635894775, "umap_1": 9.002697944641113, "region": "French", "recipe": "Thon En Chartreuse (Tuna With Vegetables and Wine)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/112440", "like": 1}, {"umap_0": -0.9483489990234375, "umap_1": 9.490569114685059, "region": "Southeast Asian", "recipe": "Atsara (Papaya Relish)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/3917", "like": 0}, {"umap_0": 11.806905746459961, "umap_1": 8.421477317810059, "region": "Italian", "recipe": "Fruit and Mascarpone Italian 
Cheesecake-Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/121506", "like": 1}, {"umap_0": 7.4960174560546875, "umap_1": 7.033345699310303, "region": "Indian Subcontinent", "recipe": "PIAZ PARATHA (Scallion Bread)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/66793", "like": 0}, {"umap_0": 0.6915923953056335, "umap_1": 8.512727737426758, "region": "Canadian", "recipe": "Yummy Crock Pot Pork Tenderloin", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/143692", "like": 0}, {"umap_0": -1.2377650737762451, "umap_1": 5.134418487548828, "region": "Italian", "recipe": "Crock Pot Italian Bow Tie Supper", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/124648", "like": 1}, {"umap_0": -1.131126046180725, "umap_1": 5.786356449127197, "region": "South American", "recipe": "Freezer Veggie, Beef and Pasta Bake", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/94606", "like": 1}, {"umap_0": 9.31960391998291, "umap_1": 8.406059265136719, "region": "Caribbean", "recipe": "Nutmeg Syrup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91068", "like": 0}, {"umap_0": 1.755436658859253, "umap_1": 5.431872367858887, "region": "Mexican", "recipe": "Pork Tacos", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/6951", "like": 0}, {"umap_0": 0.925565242767334, "umap_1": 6.156373977661133, "region": "South American", "recipe": "Sierra Beef &amp; Rice Supper", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/94679", "like": -1}, {"umap_0": 0.6192034482955933, "umap_1": 6.268477439880371, "region": "Mexican", "recipe": "Mexican Fiesta Casserole", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/83285", "like": 1}, {"umap_0": 2.120532751083374, "umap_1": 5.495871067047119, "region": "Mexican", "recipe": "Latin-Inspired Spicy Cream Chicken Stew", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/7142", "like": 1}, {"umap_0": 
-1.962687373161316, "umap_1": 10.653576850891113, "region": "French", "recipe": "Asparagus With Lemon and Tarragon", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/116117", "like": 1}, {"umap_0": 10.724434852600098, "umap_1": 9.489255905151367, "region": "Australian", "recipe": "Very Berry Fruit Terrine With Maple Custard", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76460", "like": 0}, {"umap_0": -0.3602024018764496, "umap_1": 6.830756664276123, "region": "Italian", "recipe": "Rich Italian Sausage and Potato Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/10279", "like": 0}, {"umap_0": 2.144357204437256, "umap_1": 10.131176948547363, "region": "Thai", "recipe": "Thai Yellow Curry - Vegan", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/58072", "like": -1}, {"umap_0": -0.45959705114364624, "umap_1": 8.560117721557617, "region": "Canadian", "recipe": "Slow Cooker Vegetable Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/147845", "like": 0}, {"umap_0": 0.5746748447418213, "umap_1": 6.756824970245361, "region": "French", "recipe": "Creamy Seafood Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/114655", "like": 0}, {"umap_0": 2.560509204864502, "umap_1": 9.790608406066895, "region": "Indian Subcontinent", "recipe": "Beets With Tropical Flavours", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/64534", "like": 0}, {"umap_0": 0.7873938679695129, "umap_1": 8.498421669006348, "region": "US", "recipe": "Cindi's Jambalaya Cajun Pot Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/14549", "like": 0}, {"umap_0": 0.8876485228538513, "umap_1": 5.796229362487793, "region": "US", "recipe": "Alaskan Spicy Spinach Dip", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/17288", "like": 1}, {"umap_0": 0.12840768694877625, "umap_1": 5.337678909301758, "region": "South American", "recipe": "Sopa De Pan En Cazuela 
(Columbian Bread Pot Soup)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/93660", "like": 1}, {"umap_0": 1.5170148611068726, "umap_1": 4.559617519378662, "region": "Mexican", "recipe": "Taco Burgers", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/78789", "like": 1}, {"umap_0": -0.2012803554534912, "umap_1": 11.142373085021973, "region": "Middle Eastern", "recipe": "Baked Couscous with Tomatoes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/61619", "like": 1}, {"umap_0": -2.366230010986328, "umap_1": 9.31108570098877, "region": "Italian", "recipe": "Sicilian Style Pasta Sauce W-Meat or Not", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/123166", "like": 1}, {"umap_0": -2.0142886638641357, "umap_1": 7.254846096038818, "region": "Italian", "recipe": "Artichoke,fresh Mozzarella Salami Sandwiches (Panini)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129260", "like": 1}, {"umap_0": 9.940065383911133, "umap_1": 6.373366832733154, "region": "Scandinavian", "recipe": "Mom's Pepparkakor (Gingerbread Cookies)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/105879", "like": 0}, {"umap_0": 9.190016746520996, "umap_1": 8.331125259399414, "region": "Canadian", "recipe": "Blueberry Wild Rice", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145393", "like": 0}, {"umap_0": 8.31847095489502, "umap_1": 7.207501411437988, "region": "Indian Subcontinent", "recipe": "Whole Wheat Bread With Ginger &amp; Cashews for ABM", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/67936", "like": 0}, {"umap_0": -2.2441775798797607, "umap_1": 9.45218276977539, "region": "Belgian", "recipe": "Warm Potato Salad With Lemon and Chive Vinaigrette", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/134528", "like": 1}, {"umap_0": -1.4820746183395386, "umap_1": 6.55026912689209, "region": "Italian", "recipe": "Italian Tuna Panini", "url": 
"https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/129035", "like": 1}, {"umap_0": 10.330740928649902, "umap_1": 7.4398393630981445, "region": "French", "recipe": "French Vanilla Slices (Mille-feuilles)", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/9736", "like": 0}, {"umap_0": -2.349660873413086, "umap_1": 9.312090873718262, "region": "Spanish and Portuguese", "recipe": "Winter Gazpacho", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/13673", "like": 1}, {"umap_0": -0.73268061876297, "umap_1": 4.771425724029541, "region": "Italian", "recipe": "Spinach Calzone", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/131662", "like": 1}, {"umap_0": 1.6492486000061035, "umap_1": 7.56981897354126, "region": "Australian", "recipe": "GRPA's Best Homemade Turkey Gravy", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/76297", "like": 0}, {"umap_0": -0.5097493529319763, "umap_1": 7.268265724182129, "region": "US", "recipe": "Bonnie's Crab Cakes", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/16800", "like": 0}, {"umap_0": 11.181482315063477, "umap_1": 6.232764720916748, "region": "Eastern European", "recipe": "Raised Polish Doughnuts Paczki", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/133510", "like": 0}, {"umap_0": 10.606070518493652, "umap_1": 6.4818501472473145, "region": "Mexican", "recipe": "Mexican Almond Hot Chocolate", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82431", "like": 0}, {"umap_0": 2.331252098083496, "umap_1": 6.2352399826049805, "region": "Mexican", "recipe": "Bean and Beef Shaloupias", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/7429", "like": 1}, {"umap_0": 0.6237757205963135, "umap_1": 8.454628944396973, "region": "Canadian", "recipe": "BBQ &amp; Garlic Crusted Roast Sirloin W- Wine Gravy", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/145066", "like": -1}, {"umap_0": -1.0780130624771118, 
"umap_1": 11.72840404510498, "region": "Indian Subcontinent", "recipe": "Spicy Yogurt Dressing", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/4666", "like": 0}, {"umap_0": 11.454381942749023, "umap_1": 8.859098434448242, "region": "Caribbean", "recipe": "Jamaica No Mistake Mon Coffee", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/91618", "like": 0}, {"umap_0": -1.110107183456421, "umap_1": 5.792154312133789, "region": "Italian", "recipe": "Italian Style Turkey Meatloaf", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/12654", "like": 0}, {"umap_0": 2.2700493335723877, "umap_1": 7.130560874938965, "region": "Greek", "recipe": "Kalasouna - Cheese and Onion Pie", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/110763", "like": 1}, {"umap_0": -0.15274086594581604, "umap_1": 10.90386962890625, "region": "Mexican", "recipe": "Spicy Posole Soup", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/82680", "like": 0}, {"umap_0": -1.0216495990753174, "umap_1": 8.602620124816895, "region": "Irish", "recipe": "Kenmare Bay Mussels With Wine-Cream Sauce", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/135065", "like": 0}, {"umap_0": 9.59228801727295, "umap_1": 5.764557361602783, "region": "Deutschland", "recipe": "Chocolate Hearts", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/138807", "like": 0}, {"umap_0": 10.613481521606445, "umap_1": 6.488113880157471, "region": "US", "recipe": "Mississippi Mud Cake I", "url": "https://cosylab.iiitd.edu.in/recipedb/search_recipeInfo/13791", "like": 0}]}};
var embedOpt = {"mode": "vega-lite"};

function showError(el, error){
el.innerHTML = ('&lt;div class="error" style="color:red;"&gt;'
+ '&lt;p&gt;JavaScript Error: ' + error.message + '&lt;/p&gt;'
+ "&lt;p&gt;This usually means there's a typo in your chart specification. "
+ "See the javascript console for the full traceback.&lt;/p&gt;"
+ '&lt;/div&gt;');
throw error;
}
const el = document.getElementById('vis');
vegaEmbed("#vis", spec, embedOpt)
.catch(error =&gt; showError(el, error));
})(vegaEmbed);

&lt;/script&gt;

&lt;h2 id="scandinavia-who-wouldve-thought"&gt;Scandinavia, who would've thought&lt;/h2&gt;
&lt;p&gt;The second step is to score each "unseen" item (e.g., recipes that I don't know if I will like or not). To do so - we need to look at its neighbours. Remember, we're talking about vectors, so neighbours in this context means similar vectors. Let's take the &lt;a href="https://en.wikipedia.org/wiki/Cosine_similarity"&gt;cosine similarity&lt;/a&gt; of two vectors as the measure of their similarity.&lt;/p&gt;
&lt;p&gt;The process to score an "unseen" recipe is then:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Identify its K (e.g., 10) closest neighbours&lt;/li&gt;
&lt;li&gt;Calculate the average score of the neighbours (e.g., &lt;code&gt;+1&lt;/code&gt; if we like, &lt;code&gt;-1&lt;/code&gt; if we don't)&lt;/li&gt;
&lt;li&gt;If we want to get fancy, we can weight that average with the distance to the unseen item&lt;/li&gt;
&lt;/ol&gt;
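&lt;p&gt;For the code-minded, here is a minimal, dependency-free sketch of those three steps. The function names, the toy 2-D vectors, and the choice to weight by cosine similarity (with negative similarities counted as zero) are assumptions made for illustration, not the exact code behind this post.&lt;/p&gt;

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def score_unseen(unseen, labeled, k=10):
    """Return (plain_average, weighted_average) over the k nearest labeled vectors.

    labeled is a list of (vector, label) pairs, label being +1 or -1.
    """
    # Step 1: rank labeled recipes by similarity and keep the k closest.
    ranked = sorted(
        ((cosine_similarity(unseen, vec), label) for vec, label in labeled),
        key=lambda pair: pair[0],
        reverse=True,
    )
    neighbours = ranked[:k]

    # Step 2: plain average of the neighbours' labels.
    plain = sum(label for _, label in neighbours) / len(neighbours)

    # Step 3: weight each label by its similarity (assumption: negatives count as 0).
    weights = [max(sim, 0.0) for sim, _ in neighbours]
    total = sum(weights)
    if total == 0:
        return plain, plain
    weighted = sum(w * label for w, (_, label) in zip(weights, neighbours)) / total
    return plain, weighted


# Toy 2-D "embeddings": two liked recipes close to the query, one disliked far away.
labeled = [([1.0, 0.0], 1), ([0.9, 0.1], 1), ([0.0, 1.0], -1)]
plain, weighted = score_unseen([1.0, 0.05], labeled, k=3)
```

&lt;p&gt;In this toy case the plain average is dragged down by the far-away dislike, while the weighted score stays close to the two near neighbours. That is the whole point of step 3.&lt;/p&gt;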
&lt;p&gt;If you're the visual kind:&lt;/p&gt;
&lt;p&gt;&lt;img alt="agarwall" style="border-radius: 5px;" src="https://duarteocarmo.com/images/40/agarwall.png"/&gt;&lt;/p&gt;
&lt;p&gt;With the above methodology, we can score every single "unseen" recipe. We scored every liked recipe as a &lt;code&gt;+1&lt;/code&gt; and every disliked recipe as a &lt;code&gt;-1&lt;/code&gt;. Therefore, every unlabelled recipe will have a score between these two numbers. If that score is above &lt;code&gt;0&lt;/code&gt;, I am more likely to &lt;em&gt;like&lt;/em&gt; the recipe; if not, I'm more likely to &lt;em&gt;dislike&lt;/em&gt; it.&lt;/p&gt;
&lt;p&gt;But wait. This sounds computationally expensive.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# count the likes and dislikes&lt;/span&gt;

&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;like&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value_counts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;normalize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 0    54905&lt;/span&gt;

&lt;span class="c1"&gt;# 1    48661&lt;/span&gt;

&lt;span class="c1"&gt;#-1    12091&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This means that for each one of the roughly 55,000 unseen recipes, we need to calculate the distance to each one of the 60,752 seen recipes, and only then determine which are the K closest neighbours. Yeah, that could take a long time.&lt;/p&gt;
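&lt;p&gt;To put a number on it, here is the back-of-the-envelope arithmetic using the counts from the output above. The variable names are just for illustration. A brute-force search works out to roughly 3.3 billion similarity computations.&lt;/p&gt;

```python
# Counts taken from the value_counts() output above.
unseen = 54905     # label 0: recipes without a like or dislike
liked = 48661      # label +1
disliked = 12091   # label -1

seen = liked + disliked

# Brute force: one similarity per (unseen, seen) pair.
comparisons = unseen * seen

print(f"{seen:,} seen recipes, {comparisons:,} pairwise similarities")
```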
&lt;h2 id="quick-maths-literally"&gt;Quick maths - literally&lt;/h2&gt;
&lt;p&gt;Fortunately - Facebook's AI team has tackled this challenge. Not only that, they've shared it as an &lt;a href="https://github.com/facebookresearch/faiss"&gt;open-source package&lt;/a&gt;. Faiss is an open-source library that does exactly what we need. It finds the K most similar vectors to the one(s) you're searching for.&lt;/p&gt;
&lt;p&gt;Searching across roughly 116,000 recipes is no longer a problem. It takes less than 2 seconds on my machine:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# start by taking the labeled and unlabelled embeddings&lt;/span&gt;

&lt;span class="n"&gt;labeled_embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embedding_matrix&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;labeled_indices&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;:]&lt;/span&gt;
&lt;span class="n"&gt;unlabeled_embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embedding_matrix&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;unlabeled_indices&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;:]&lt;/span&gt;

&lt;span class="c1"&gt;# create our index&lt;/span&gt;

&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;labeled_embeddings&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;faiss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IndexFlatIP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# normalize and add our labeled recipes&lt;/span&gt;

&lt;span class="n"&gt;faiss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;normalize_L2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;labeled_embeddings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;labeled_embeddings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# normalize and query for our unlabelled recipes&lt;/span&gt;

&lt;span class="n"&gt;faiss&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;normalize_L2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;unlabeled_embeddings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;r_distances&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r_indexes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;unlabeled_embeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;On larger datasets (we're talking 1M+ vectors) the exact neighbour search can be very computationally expensive.&lt;/p&gt;
&lt;p&gt;With FAISS, we are able to speed that up. We can do &lt;em&gt;approximate&lt;/em&gt; searches to save time, and we can run these calculations on GPUs. It's an awesome tool, with a great &lt;a href="https://github.com/facebookresearch/faiss/"&gt;wiki&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="icelandic-caramel-potatoes"&gt;Icelandic Caramel Potatoes&lt;/h2&gt;
&lt;p&gt;Yup - back to food.&lt;/p&gt;
&lt;p&gt;After some data wrangling, I was able to give a weighted recommendation to every recipe in the network:&lt;/p&gt;
&lt;p&gt;&lt;img alt="recommendations" style="border-radius: 5px;" src="https://duarteocarmo.com/images/40/recommendations.png"/&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;distances&lt;/code&gt;: the distances that each one of the 10 neighbours has to the recipe in the row&lt;/li&gt;
&lt;li&gt;&lt;code&gt;scores&lt;/code&gt;: the scores of the 10 closest neighbours to the recipe (+1 if I like, -1 if I dislike)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;score_normal&lt;/code&gt;: The average score of the 10 neighbours&lt;/li&gt;
&lt;li&gt;&lt;code&gt;score_weighted&lt;/code&gt;: &lt;code&gt;score_normal&lt;/code&gt;, weighted by &lt;code&gt;distance&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As you can see above - I don't think I would like that &lt;em&gt;Suya&lt;/em&gt; recipe. But both French recipes on the table have perfect 1.0 scores - I should like them. In principle of course. &lt;em&gt;Pear Brandy Souffle&lt;/em&gt; looks like a party I wouldn't like to get into - but hey, you only know if you try it.&lt;/p&gt;
&lt;p&gt;There's another interesting question we can explore: What's my most recommended region? What is the region that suits my tastes the most? Pandas to the rescue:&lt;/p&gt;
&lt;p&gt;&lt;img alt="recipe_regions" style="border-radius: 5px;" src="https://duarteocarmo.com/images/40/regions.png"/&gt;&lt;/p&gt;
&lt;p&gt;Looks like French cuisine has the highest average recipe score. But Scandinavia is a close second! Who would've thought that? Maybe I do belong in Scandinavia.&lt;/p&gt;
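&lt;p&gt;The aggregation behind that table is a group-by followed by a mean. I used Pandas for it; the sketch below does the same thing with only the standard library. The &lt;code&gt;region&lt;/code&gt; key, the helper name, and the sample rows are made up for illustration.&lt;/p&gt;

```python
from collections import defaultdict


def average_score_by_region(recipes):
    """Return (region, mean score) pairs, best region first."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for recipe in recipes:
        totals[recipe["region"]] += recipe["score_weighted"]
        counts[recipe["region"]] += 1
    averages = {region: totals[region] / counts[region] for region in totals}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Made-up rows, not the real dataset
recipes = [
    {"region": "French", "score_weighted": 1.0},
    {"region": "French", "score_weighted": 0.8},
    {"region": "Scandinavian", "score_weighted": 0.85},
    {"region": "Other", "score_weighted": -0.4},
]

for region, score in average_score_by_region(recipes):
    print(region, round(score, 2))
```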
&lt;h2 id="the-return-of-simple-tools"&gt;The return of simple tools&lt;/h2&gt;
&lt;p&gt;Transparency is one of the biggest advantages of this method. By looking at the closest neighbours column - we are able to know &lt;em&gt;how&lt;/em&gt; a certain recommendation was made. With so many ML algorithms and transformers out there - interpretability &lt;em&gt;is not an easy thing to land&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Machine learning has made incredible advancements in the last 3 to 5 years. We have transformers, deep architectures, and massive multi-billion parameter models serving users. Amidst all these advancements, it's easy to get lost in always adopting the shiny new thing, even if we don't understand how that shiny thing works.&lt;/p&gt;
&lt;p&gt;There's a certain beauty to simple tools. Simple, explainable, and interpretable algorithms are a joy. And if &lt;a href="/blog/simple-software.html"&gt;simple is good&lt;/a&gt; - why complicate?&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Serving models with FastAPI: It's not just about the speed</title><link href="https://duarteocarmo.com/blog/serving-ml-models-fastapi.html" rel="alternate"/><published>2022-06-10T11:00:00+02:00</published><updated>2022-06-10T11:00:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2022-06-10:/blog/serving-ml-models-fastapi.html</id><summary type="html">&lt;p&gt;&lt;em&gt;Disclaimer: This post was originally published in the &lt;a href="http://blog.amplemarket.com/"&gt;Amplemarket blog&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Serving and deploying Machine Learning models is a topic that can get complicated quite fast. At Amplemarket, my team and I like to keep things simple.&lt;/p&gt;
&lt;p&gt;Let's talk about it.&lt;/p&gt;
&lt;h2 id="simplicity-is-key"&gt;Simplicity is key&lt;/h2&gt;
&lt;p&gt;At &lt;a href="https://amplemarket.com/"&gt;Amplemarket&lt;/a&gt;, we’re big fans of …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;em&gt;Disclaimer: This post was originally published in the &lt;a href="http://blog.amplemarket.com/"&gt;Amplemarket blog&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Serving and deploying Machine Learning models is a topic that can get complicated quite fast. At Amplemarket, my team and I like to keep things simple.&lt;/p&gt;
&lt;p&gt;Let's talk about it.&lt;/p&gt;
&lt;h2 id="simplicity-is-key"&gt;Simplicity is key&lt;/h2&gt;
&lt;p&gt;At &lt;a href="https://amplemarket.com/"&gt;Amplemarket&lt;/a&gt;, we’re big fans of simplifying. We know that the less code we have to write, the fewer bugs we're likely to introduce. &lt;a href="https://fastapi.tiangolo.com/"&gt;FastAPI&lt;/a&gt; allows us to take a trained model and create an API for it in fewer than 10 lines of code! That’s pretty incredible.&lt;/p&gt;
&lt;p&gt;Let’s say you are creating a model that is trained on the Iris data set and predicts the species of a plant given some information about it. Previously, we would have saved our model to a &lt;code&gt;pickle&lt;/code&gt; file, and sent it to our engineering team; perhaps we would have even scheduled a meeting with them to discuss how to use this model. We would also have sent some documentation on how to use that model and how to run inference with it on new data.&lt;/p&gt;
&lt;p&gt;But the likelihood that something will go wrong in one of those steps is greater than we’d like.
Every time the model gets updated, we would need another discussion about how the model has changed and what has improved, the documentation would get outdated, and so on.&lt;/p&gt;
&lt;p&gt;To avoid dealing with that house of cards, we at Amplemarket have settled on a process that significantly reduces the complexity of deploying the machine learning models we develop.&lt;/p&gt;
&lt;p&gt;As an example, the following snippet trains a classifier and saves it to the &lt;code&gt;model.joblib&lt;/code&gt; file:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: To run the snippets, make sure to &lt;code&gt;python -m pip install scikit-learn fastapi "uvicorn[standard]"&lt;/code&gt;&lt;/em&gt;&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;sklearn&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;svm&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;sklearn&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datasets&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;joblib&lt;/span&gt;

&lt;span class="c1"&gt;# create model&lt;/span&gt;

&lt;span class="n"&gt;clf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;svm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SVC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;probability&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datasets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load_iris&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;return_X_y&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;clf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# save model&lt;/span&gt;

&lt;span class="n"&gt;joblib&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dump&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;clf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;model.joblib&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;With &lt;a href="https://fastapi.tiangolo.com/"&gt;FastAPI&lt;/a&gt;, serving your model is as simple as creating an &lt;code&gt;app.py&lt;/code&gt; file with the following contents:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# app.py&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;fastapi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastAPI&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;joblib&lt;/span&gt;

&lt;span class="c1"&gt;# create FastAPI app and load model&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;joblib&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;model.joblib&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# create an endpoint that receives POST requests and returns predictions&lt;/span&gt;

&lt;span class="nd"&gt;@app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;/predict/&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# parse the features as JSON, e.g. [5.1, 3.5, 1.4, 0.2], rather than calling eval on user input&lt;/span&gt;
    &lt;span class="n"&gt;predictions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;predictions&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You can then launch the API with &lt;code&gt;uvicorn app:app --reload&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Such a simple setup to launch an API means that our Machine Learning team is not limited to creating these models: it is also responsible for making them available to our development team, or even directly to our users.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;“You built it? You ship it”&lt;/em&gt;&lt;/p&gt;
&lt;h2 id="reduce-the-need-for-coordination"&gt;Reduce the need for coordination&lt;/h2&gt;
&lt;p&gt;As a remote, distributed, and asynchronous team, documentation is huge for us. We don’t want to have a meeting with 5 other departments every time we build or ship a new model. We want to document as best we can, with the least effort possible.&lt;/p&gt;
&lt;p&gt;With the script described above, if you visit &lt;code&gt;localhost:8000/docs&lt;/code&gt; you’ll notice that your API includes some documentation out of the box:&lt;/p&gt;
&lt;p&gt;&lt;img alt="API-screenshot-1" src="https://duarteocarmo.com/images/39/1.png" /&gt;&lt;/p&gt;
&lt;p&gt;These docs already tell a lot about the application: what endpoints it has, what type of queries the user can send, etc. We love FastAPI because it takes little work to enrich the documentation further and minimize the need for future coordination.&lt;/p&gt;
&lt;p&gt;Let’s add a couple of things to our &lt;code&gt;app.py&lt;/code&gt; script:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;fastapi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastAPI&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;joblib&lt;/span&gt;

&lt;span class="c1"&gt;# some documentation in markdown&lt;/span&gt;

&lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

&lt;span class="s2"&gt;## Documentation&lt;/span&gt;

&lt;span class="s2"&gt;**ℹ️ Read carefully before using**&lt;/span&gt;

&lt;span class="s2"&gt;This API allows you to predict the type of Iris plant given a list of features.&lt;/span&gt;
&lt;span class="s2"&gt;The features should be:&lt;/span&gt;
&lt;span class="s2"&gt;* sepal length in cm&lt;/span&gt;
&lt;span class="s2"&gt;* sepal width in cm&lt;/span&gt;
&lt;span class="s2"&gt;* petal length in cm&lt;/span&gt;
&lt;span class="s2"&gt;* petal width in cm&lt;/span&gt;

&lt;span class="s2"&gt;_Built by:_&lt;/span&gt;
&lt;span class="s2"&gt;![logo](&amp;lt;https://amplemarket.com/_next/image?url=&lt;/span&gt;&lt;span class="si"&gt;%2F&lt;/span&gt;&lt;span class="s2"&gt;svg&lt;/span&gt;&lt;span class="si"&gt;%2F&lt;/span&gt;&lt;span class="s2"&gt;logo.svg&amp;amp;w=384&amp;amp;q=75&amp;gt;)&lt;/span&gt;
&lt;span class="s2"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

&lt;span class="c1"&gt;# create FastAPI app and load model&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;IRIS Classification&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;0.1&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;contact&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Amplemarket&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;url&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;https://amplemarket.com&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;email&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;support@yourcompany.com&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# ... rest of the file ....&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;If we now turn back to our docs, we see that the markdown we've added has been rendered at the top of the page. Now, when someone needs to check some details about this model, all the information that person might need is neatly described - and we even provide a support email if they run into trouble!&lt;/p&gt;
&lt;p&gt;&lt;img alt="API-screenshot-2" src="https://duarteocarmo.com/images/39/2.png" /&gt;&lt;/p&gt;
&lt;p&gt;But FastAPI doesn’t stop there.&lt;/p&gt;
&lt;p&gt;With a couple more lines of code, and thanks to a small library called &lt;a href="https://pydantic-docs.helpmanual.io/"&gt;Pydantic&lt;/a&gt;, we can also add data validation to our model’s API. By doing so, API users will know what kind of data it expects to receive and what kind of data the API will respond back with.&lt;/p&gt;
&lt;p&gt;We start by creating two classes, one to handle requests, and the other for responses:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;confloat&lt;/span&gt;

&lt;span class="c1"&gt;# We&amp;#39;ll take this in:&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Features&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;sepal_length&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;confloat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ge&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;le&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# ensures values are between 0 and 1&lt;/span&gt;
    &lt;span class="n"&gt;sepal_width&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;confloat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ge&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;le&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;petal_length&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;confloat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ge&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;le&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;petal_width&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;confloat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ge&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;le&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# with an example&lt;/span&gt;
    &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;schema_extra&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;example&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="s2"&gt;&amp;quot;sepal_length&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s2"&gt;&amp;quot;sepal_width&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s2"&gt;&amp;quot;petal_length&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s2"&gt;&amp;quot;petal_width&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# We&amp;#39;ll respond something like this:&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;setosa_probability&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;confloat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ge&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;le&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;versicolor_probability&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;confloat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ge&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;le&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;virginica_probability&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;confloat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ge&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;le&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# with an example&lt;/span&gt;
    &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;schema_extra&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;&amp;quot;example&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="s2"&gt;&amp;quot;setosa_probability&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s2"&gt;&amp;quot;versicolor_probability&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s2"&gt;&amp;quot;virginica_probability&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;And we tweak our endpoint code:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;# the endpoint&lt;/span&gt;

&lt;span class="nd"&gt;@app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;/predict/&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Features&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;feature_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sepal_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sepal_width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;petal_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;petal_width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;predictions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predict_proba&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;feature_list&lt;/span&gt;&lt;span class="p"&gt;])[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;predictions_clean&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;setosa_probability&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;versicolor_probability&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;virginica_probability&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;predictions_clean&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The two classes above are particularly strict about what our API can receive and what it will respond back with. Suppose a developer tries to query the API with a &lt;code&gt;petal_width&lt;/code&gt; of 1.1. Because we’ve specified that petal width must be a number between 0.0 and 1.0, our API will reject the query and reply back with:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="c1"&gt;## request&lt;/span&gt;

&lt;span class="n"&gt;curl&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;POST&amp;#39;&lt;/span&gt; \
  &lt;span class="s1"&gt;&amp;#39;http://localhost:8000/predict/&amp;#39;&lt;/span&gt; \
  &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;H&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;accept: application/json&amp;#39;&lt;/span&gt; \
  &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;H&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Content-Type: application/json&amp;#39;&lt;/span&gt; \
  &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;{&lt;/span&gt;
  &lt;span class="s2"&gt;&amp;quot;sepal_length&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s2"&gt;&amp;quot;sepal_width&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s2"&gt;&amp;quot;petal_length&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s2"&gt;&amp;quot;petal_width&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;1.1&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;

&lt;span class="c1"&gt;## response&lt;/span&gt;

&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;&amp;quot;detail&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="s2"&gt;&amp;quot;loc&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;body&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;petal_width&amp;quot;&lt;/span&gt;
      &lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="s2"&gt;&amp;quot;msg&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;ensure this value is less than or equal to 1.0&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="s2"&gt;&amp;quot;type&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;value_error.number.not_le&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="s2"&gt;&amp;quot;ctx&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;&amp;quot;limit_value&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;As an added bonus, the model Config classes also help provide developers with an example request and response:&lt;/p&gt;
&lt;p&gt;&lt;img alt="API-screenshot-3" src="https://duarteocarmo.com/images/39/3.png" /&gt;&lt;/p&gt;
&lt;p&gt;With this documentation page at hand, our users know exactly what our API is, what it &lt;em&gt;expects&lt;/em&gt;, and what it will &lt;em&gt;reply back&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;All of this comes without us having to invest much time and effort into ensuring the documentation has all the information that future developers might need. And notice how we didn’t have to write any extra documentation.&lt;/p&gt;
&lt;p&gt;Our documentation &lt;em&gt;is&lt;/em&gt; our code.&lt;/p&gt;
&lt;h2 id="make-it-fast-enough"&gt;Make it fast (enough)&lt;/h2&gt;
&lt;p&gt;Python is far from the fastest language out there, nor does it claim to be. We don’t use Python for its speed, but for its ecosystem, especially as it relates to data science and its many needs.&lt;/p&gt;
&lt;p&gt;Even with &lt;a href="https://andrewbrookins.com/python/is-fastapi-a-fad/"&gt;several&lt;/a&gt; &lt;a href="https://fastapi.tiangolo.com/benchmarks/"&gt;claims&lt;/a&gt; that FastAPI is a very performant web framework, we know we’re not using the &lt;a href="https://www.techempower.com/benchmarks/"&gt;fastest web framework&lt;/a&gt; out there.&lt;/p&gt;
&lt;p&gt;However, even if FastAPI were slower than it currently is, we would still be willing to trade that speed for time-to-market, documentation, and ease of use.&lt;/p&gt;
&lt;p&gt;For deployment, &lt;a href="https://tiangolo.com/"&gt;Sebastián Ramírez&lt;/a&gt; offers a &lt;a href="https://github.com/tiangolo/uvicorn-gunicorn-fastapi-docker"&gt;FastAPI/uvicorn high-performance docker image&lt;/a&gt; with auto-tuning, which allows the app to scale according to the number of available CPU cores on the machine it's running on:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;tiangolo/uvicorn-gunicorn-fastapi:python3.9&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;./requirements.txt&lt;span class="w"&gt; &lt;/span&gt;/app/requirements.txt
&lt;span class="k"&gt;RUN&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;pip&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;--no-cache-dir&lt;span class="w"&gt; &lt;/span&gt;--upgrade&lt;span class="w"&gt; &lt;/span&gt;-r&lt;span class="w"&gt; &lt;/span&gt;/app/requirements.txt
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;./app&lt;span class="w"&gt; &lt;/span&gt;/app
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;If you’re running on Kubernetes or a similar orchestrator, you &lt;a href="https://github.com/tiangolo/uvicorn-gunicorn-fastapi-docker#-warning-you-probably-dont-need-this-docker-image"&gt;probably don’t need this image&lt;/a&gt; - but if you have a simpler setup, this image will come in very handy!&lt;/p&gt;
&lt;h2 id="deploy-with-confidence"&gt;Deploy with confidence&lt;/h2&gt;
&lt;p&gt;At &lt;a href="https://amplemarket.com/"&gt;Amplemarket&lt;/a&gt; we like to adopt best practices from Software Development and apply them to our Machine Learning projects. That means every model we develop and deploy is version controlled, tested, and continuously deployed.&lt;/p&gt;
&lt;p&gt;Using CI/CD automation, we can continuously deploy our model and code to several targets (e.g., staging and production). This allows us to serve different versions of our model at different endpoints, and to roll back with ease.&lt;/p&gt;
&lt;p&gt;FastAPI also allows us to &lt;a href="https://fastapi.tiangolo.com/tutorial/testing/"&gt;easily test&lt;/a&gt; our code. This is especially important at inference time. What if we receive a different set of numbers? Will we make a prediction? What if the user sends us a string, and it gets evaluated as a float? How do we account for that? Testing matters. And FastAPI allows us to do it with ease.&lt;/p&gt;
&lt;h2 id="closing-thoughts"&gt;Closing thoughts&lt;/h2&gt;
&lt;p&gt;FastAPI has been particularly valuable when serving our Machine Learning models. Our development team is especially happy with the high level of documentation and data validation that our APIs offer, and our users get those benefits too.&lt;/p&gt;
&lt;p&gt;Thanks to FastAPI, we’ve been able to predict with confidence, and I hope you got some inspiration from this post to gain confidence as well.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: This is also a show of appreciation to the incredible work of FastAPI’s lead developer &lt;a href="https://tiangolo.com/"&gt;Sebastián Ramírez&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Mac apps you didn't know you needed</title><link href="https://duarteocarmo.com/blog/incredible-mac-apps.html" rel="alternate"/><published>2022-04-12T09:30:00+02:00</published><updated>2022-04-12T09:30:00+02:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2022-04-12:/blog/incredible-mac-apps.html</id><summary type="html">&lt;p&gt;I'm a bit of an &lt;a href="https://www.urbandictionary.com/define.php?term=iSheep"&gt;iSheep&lt;/a&gt;. I didn't even know that term existed until a colleague used it on me a couple of years ago. But this is not an Apple vs. Windows (vs. Linux) post. I like my Mac, you might prefer your Windows or Linux machine - that's fine …&lt;/p&gt;</summary><content type="html">&lt;p&gt;I'm a bit of an &lt;a href="https://www.urbandictionary.com/define.php?term=iSheep"&gt;iSheep&lt;/a&gt;. I didn't even know that term existed until a colleague used it on me a couple of years ago. But this is not an Apple vs. Windows (vs. Linux) post. I like my Mac, you might prefer your Windows or Linux machine - that's fine. This post focuses on Mac apps.&lt;/p&gt;
&lt;p&gt;I've worked remotely for almost 3 years now - and the importance of little things starts showing. Those things that make my day easier, more pleasant, more productive. From the &lt;a href="/blog/how-to-work-from-home-revisited"&gt;desk I sit at&lt;/a&gt;, to the &lt;a href="/blog/vim-for-python-development-and-not-only"&gt;editor&lt;/a&gt; I use, these things &lt;em&gt;matter&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;There's a certain &lt;em&gt;magic&lt;/em&gt; to software that makes everyday work just a &lt;em&gt;little&lt;/em&gt; more enjoyable and productive. Here's a list of apps that help me achieve just that.&lt;/p&gt;
&lt;h3 id="meetingbar"&gt;&lt;a target="_blank" href="https://meetingbar.onrender.com/"&gt;MeetingBar&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;There are few things in life I hate more than an Outlook notification 10 minutes before my next meeting. Especially in a remote setting. Yes, I want to know when the next meeting will take place - but without everything buzzing all the time. MeetingBar does just that. Your next meeting in your menu bar. When it's time, hit your shortcut and you're there.&lt;/p&gt;
&lt;h3 id="itsycal"&gt;&lt;a target="_blank" href="https://www.mowglii.com/itsycal/"&gt;Itsycal&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;I like Apple's Calendar app (yeah, iSheep - I told you) - but during a workday, you often just want to check a simple date. "Are you available on the 5th of July?", "What are you doing on week #34?". These are tough questions. Itsycal answers all those, right in your menu bar, in a couple of clicks.&lt;/p&gt;
&lt;h3 id="maestral"&gt;&lt;a target="_blank" href="https://maestral.app/"&gt;Maestral&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;I still use Dropbox. It's not the best Cloud service, but it was the first I used. There's one thing I can't stand though: their Mac app. It's always trying to push the next big Dropbox thing on me, eating my RAM, or annoying me with pop-ups. I &lt;em&gt;just&lt;/em&gt; want file sync. Maestral is a lightweight alternative client that does Dropbox file sync - and nothing more - without annoying me or eating my RAM. 100% open-source - what an app.&lt;/p&gt;
&lt;h3 id="clocker"&gt;&lt;a target="_blank" href="https://abhishekbanthia.com/clocker/"&gt;Clocker&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Who likes dealing with time zones? I certainly don't. If it's 8AM in Copenhagen, it's 7AM in Lisbon, which in Singapore means...? I have no clue. Clocker solves just that. In a few clicks I can select a time, see what it corresponds to in other time zones, and copy all those to my clipboard. Done. No more time zone hell.&lt;/p&gt;
&lt;h3 id="raycast"&gt;&lt;a target="_blank" href="https://www.raycast.com/"&gt;Raycast&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;For years Spotlight search was good enough. Then I started using &lt;a href="https://www.alfredapp.com/"&gt;Alfred&lt;/a&gt;, which was pretty awesome. Now I use Raycast, which is even better. I think of it as Spotlight on steroids (I really hate that expression). I can find files super fast, access my clipboard history, search messages on Slack, search my Google Drive, tweet, check my schedule... You get my point - go give it a shot.&lt;/p&gt;
&lt;h3 id="podcastmenu"&gt;&lt;a target="_blank" href="https://github.com/insidegui/PodcastMenu"&gt;PodcastMenu&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="https://overcast.fm/"&gt;Overcast&lt;/a&gt; is my favourite podcast app. When working at my desk, I sometimes listen to podcasts, &lt;em&gt;especially&lt;/em&gt; if I'm doing something repetitive (e.g., bookkeeping). PodcastMenu puts Overcast right in my menu bar. It also syncs perfectly with the mobile version - so that I can keep listening to my pods when I leave my place.&lt;/p&gt;
&lt;h3 id="textsniper"&gt;&lt;a target="_blank" href="https://textsniper.app/"&gt;TextSniper&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Someone sends me a bad scan of a document and there's a number I need to copy from it. There's a weird PDF that I need to copy a reference from. What do I do? I used to look at them one-by-one and copy them over. Now I just hit CMD+Shift+2, and TextSniper uses OCR to let me copy the text. This has saved me &lt;em&gt;at least&lt;/em&gt; an hour of my life.&lt;/p&gt;
&lt;h3 id="rectangle"&gt;&lt;a target="_blank" href="https://rectangleapp.com/"&gt;Rectangle&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;We agree. Window management in macOS is... lacking, to say the least. For a long time, I used &lt;a href="https://www.spectacleapp.com/"&gt;Spectacle&lt;/a&gt; to manage my windows, and I really liked it. Now that Spectacle is not in active development anymore, the successor seems to be Rectangle. It's a free and open-source option which has served me very well, and which I can definitely recommend.&lt;/p&gt;
&lt;h3 id="stats"&gt;&lt;a target="_blank" href="https://github.com/exelban/stats"&gt;Stats&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;OK, you don't exactly &lt;em&gt;need&lt;/em&gt; this one. But I like to glance at my menu bar to see my RAM, SSD, and CPU stats. Particularly if I'm doing something on the &lt;em&gt;heavier&lt;/em&gt; side. Stats is a free and open source alternative to the popular &lt;a href="https://bjango.com/mac/istatmenus/"&gt;iStat Menus&lt;/a&gt; - that sticks to the basics. It allows me to customize whatever I want to see in my menu bar and keep an eye on the state of my Mac.&lt;/p&gt;
&lt;h2 id="final-thoughts"&gt;Final thoughts&lt;/h2&gt;
&lt;p&gt;You'll notice that &lt;em&gt;at least&lt;/em&gt; 7 out of the 9 applications I mentioned are free &lt;em&gt;and&lt;/em&gt; open source. Not only that, but the developers are also regularly pushing updates and making these apps better every day. It's incredible how good a product of passion can become. This post is my way of thanking them for their awesome work. Don't forget to also support them.&lt;/p&gt;
&lt;p&gt;Someone once told me that my professional network is one of the most important things I'll build throughout my career. For years I've struggled to find a system that would help me stay in touch with my network. So I decided to build my own - let me show you how …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;img src="https://duarteocarmo.com/images/37/infrequent_screenshot.png" alt="Embeddings" style="max-width:100%;"&gt;&lt;/p&gt;
&lt;p&gt;Someone once told me that my professional network is one of the most important things I'll build throughout my career. For years I've struggled to find a system that would help me stay in touch with my network. So I decided to build my own - let me show you how it works.&lt;/p&gt;
&lt;h2 id="the-idea"&gt;The idea&lt;/h2&gt;
&lt;p&gt;The idea is pretty simple: you want to stay in touch with a group of people by talking to them every X days/months/years. Let's say you want to keep up with Jane every two months. If two months have passed and you haven't talked, you should get nudged into doing it.&lt;/p&gt;
&lt;p&gt;To solve this, I started with a recurring item in my to-do list, but it quickly got messy. I tested &lt;a href="https://getdex.com/"&gt;Dex&lt;/a&gt;, but they would only allow me to keep tabs on 15 people. &lt;a href="https://www.monicahq.com/"&gt;Monica&lt;/a&gt; looks like a fun deployment challenge - that's not what I'm looking for.&lt;/p&gt;
&lt;p&gt;What I really wanted was something &lt;em&gt;simple&lt;/em&gt;. Something like other &lt;a href="https://kindle-highlights.email/"&gt;small things&lt;/a&gt; I've built before? I bet I could build something simple using a bit of Python.&lt;/p&gt;
&lt;h2 id="how-it-works"&gt;How it works&lt;/h2&gt;
&lt;p&gt;At the core of the tool is a simple folder. In it, every person that I want to stay in touch with gets their own Markdown file:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;├──&lt;span class="w"&gt; &lt;/span&gt;infrequent.py
└──&lt;span class="w"&gt; &lt;/span&gt;people
&lt;span class="w"&gt;    &lt;/span&gt;├──&lt;span class="w"&gt; &lt;/span&gt;john-snow.md
&lt;span class="w"&gt;    &lt;/span&gt;├──&lt;span class="w"&gt; &lt;/span&gt;arya-stark.md
&lt;span class="w"&gt;    &lt;/span&gt;├──&lt;span class="w"&gt; &lt;/span&gt;...
&lt;span class="w"&gt;    &lt;/span&gt;└──&lt;span class="w"&gt; &lt;/span&gt;sansa-stark.md
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;At the top of the file: the person's name, our relationship, and the frequency at which we should keep in touch. After that, every entry is a header with a date of contact, and some notes about what happened on that date. Here's an example:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;Name: John Snow
Relationship: Idol
Interval(every): 2 months

&lt;span class="gu"&gt;## 21-03-2022&lt;/span&gt;

&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Was recruiting for more troops
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Dead walkers in the wall

&lt;span class="gu"&gt;## 13-03-2022&lt;/span&gt;

&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Had to go to the wall
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;He was starting to notice something changing in the weather
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
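&lt;p&gt;To make the format concrete, here is a minimal sketch of how a file like this could be parsed. It is not the actual &lt;code&gt;infrequent.py&lt;/code&gt;: the &lt;code&gt;parse_person&lt;/code&gt; function and the &lt;code&gt;SAMPLE&lt;/code&gt; string are hypothetical, and it assumes that header lines come before the first date heading and that dates are written day-month-year, as in the example.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;import re
from datetime import date, datetime

# Hypothetical sample that mirrors the file layout from the example.
SAMPLE = """Name: John Snow
Relationship: Idol
Interval(every): 2 months

## 21-03-2022

- Was recruiting for more troops

## 13-03-2022

- Had to go to the wall
"""


def parse_person(text):
    """Return the header fields and the sorted contact dates of one file."""
    fields = {}
    dates = []
    for line in text.splitlines():
        heading = re.fullmatch(r"## (\d{2}-\d{2}-\d{4})", line.strip())
        if heading:
            # Date headings are assumed to be day-month-year.
            dates.append(datetime.strptime(heading.group(1), "%d-%m-%Y").date())
        elif not dates and ":" in line:
            # Header lines only count before the first date heading.
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return fields, sorted(dates)


fields, dates = parse_person(SAMPLE)
print(fields["Name"], fields["Interval(every)"], dates[-1])
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;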

&lt;p&gt;Using a small script I call &lt;code&gt;infrequent.py&lt;/code&gt;, I parse all the files in the &lt;code&gt;people&lt;/code&gt; folder. The script retrieves the person's details and dates of interactions. Using some datetime logic, I then identify the people I'm late reaching out to:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="n"&gt;interaction_dates&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;last_interaction_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;interaction_dates&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;next_interaction_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;last_interaction_date&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;time_delta&lt;/span&gt;

&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;relation&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; - (Cadency: every &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;contact_frequency&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;delay_in_days&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; days passed since last contact)&amp;quot;&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;next_interaction_date&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;today&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;to_contact&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
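&lt;p&gt;The snippet leaves out how &lt;code&gt;time_delta&lt;/code&gt; is computed. Here is a self-contained sketch of the same check under stated assumptions: the &lt;code&gt;to_timedelta&lt;/code&gt; and &lt;code&gt;is_overdue&lt;/code&gt; helpers are hypothetical, and a month is approximated as 30 days and a year as 365, which may differ from what the real script does.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;from datetime import date, timedelta

# Assumed approximations: the post does not say how intervals become days.
DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30, "year": 365}


def to_timedelta(interval):
    """Turn a string such as '2 months' into a timedelta."""
    amount, unit = interval.split()
    return timedelta(days=int(amount) * DAYS_PER_UNIT[unit.rstrip("s")])


def is_overdue(interaction_dates, interval, today):
    """True when the next planned contact date is already in the past."""
    next_interaction_date = max(interaction_dates) + to_timedelta(interval)
    return today > next_interaction_date


dates = [date(2022, 3, 13), date(2022, 3, 21)]
print(is_overdue(dates, "2 months", date(2022, 6, 1)))  # True, due on 2022-05-20
print(is_overdue(dates, "2 months", date(2022, 4, 1)))  # False
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;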

&lt;p&gt;Once every person is processed, the same script sends me an email with the people I'm late reaching out to, using &lt;a href="https://aws.amazon.com/ses/"&gt;SES&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="how-i-use-it"&gt;How I use it&lt;/h2&gt;
&lt;p&gt;I store the &lt;code&gt;people&lt;/code&gt; folder and the &lt;code&gt;infrequent.py&lt;/code&gt; script in a GitHub repository - &lt;a href="https://github.com/duarteocarmo/tiny-crm-demo"&gt;similar to this one&lt;/a&gt;. Thanks to GitHub Actions, the script runs every Friday morning. When it does, I receive an email with the names of people I'm late reaching out to.&lt;/p&gt;
&lt;p&gt;Every time I interact with someone in the folder, I update their file, and push the changes back to GitHub. It's simple, and works surprisingly well.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/duarteocarmo/tiny-crm-demo"&gt;Here's a copy of my repo&lt;/a&gt; in case you want to play around with the script. Feel free to &lt;a href="mailto:me@duarteocarmo.com"&gt;email me&lt;/a&gt; if you have problems getting it to run.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;This post was inspired by &lt;a href="https://sive.rs/hundreds"&gt;this&lt;/a&gt; article&lt;/em&gt;&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Visualizing every job in the world</title><link href="https://duarteocarmo.com/blog/every-job-world.html" rel="alternate"/><published>2022-02-10T09:00:00+01:00</published><updated>2022-02-10T09:00:00+01:00</updated><author><name>Duarte O.Carmo</name></author><id>tag:duarteocarmo.com,2022-02-10:/blog/every-job-world.html</id><summary type="html">&lt;p&gt;&lt;a href="https://projector.tensorflow.org/?config=https://raw.githubusercontent.com/duarteocarmo/esco-visualizations/master/projector_config.json"&gt;
&lt;img src="https://duarteocarmo.com/images/36/every-job-world.png" alt="Embeddings" style="max-width:100%;"&gt;
&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Imagine you have to classify &lt;em&gt;every single job title&lt;/em&gt; in the world into 10 categories, how would you go about it?&lt;/p&gt;
&lt;p&gt;This is a  &lt;em&gt;fairly hard&lt;/em&gt; problem to solve. However, the European Union has actually taken it on. They named it: &lt;a href="https://ec.europa.eu/esco/portal/occupation"&gt;the ESCO project.&lt;/a&gt; (ESCO stands for European Skills, Competences …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;a href="https://projector.tensorflow.org/?config=https://raw.githubusercontent.com/duarteocarmo/esco-visualizations/master/projector_config.json"&gt;
&lt;img src="https://duarteocarmo.com/images/36/every-job-world.png" alt="Embeddings" style="max-width:100%;"&gt;
&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Imagine you have to classify &lt;em&gt;every single job title&lt;/em&gt; in the world into 10 categories, how would you go about it?&lt;/p&gt;
&lt;p&gt;This is a &lt;em&gt;fairly hard&lt;/em&gt; problem to solve. However, the European Union has actually taken it on. They named it &lt;a href="https://ec.europa.eu/esco/portal/occupation"&gt;the ESCO project&lt;/a&gt; (ESCO stands for European Skills, Competences, Qualifications and Occupations).&lt;/p&gt;
&lt;p&gt;Damn I love the EU.&lt;/p&gt;
&lt;p&gt;What if we scraped their database of 3001 occupations and ran them through the famous &lt;a href="https://en.wikipedia.org/wiki/BERT_(language_model)"&gt;BERT&lt;/a&gt; model? We could look at all the &lt;a href="https://ec.europa.eu/esco/portal/occupation?uri=http%3A%2F%2Fdata.europa.eu%2Fesco%2Fisco%2FC821&amp;amp;conceptLanguage=en&amp;amp;full=true#&amp;amp;uri=http://data.europa.eu/esco/isco/C821"&gt;alternative names&lt;/a&gt; for each title/occupation, and build a network of jobs.&lt;/p&gt;
&lt;p&gt;The result is &lt;a href="https://projector.tensorflow.org/?config=https://raw.githubusercontent.com/duarteocarmo/esco-visualizations/master/projector_config.json"&gt;this projection&lt;/a&gt;: a network of every job in the world, that clusters titles by similarity.&lt;/p&gt;
&lt;p&gt;You can also run the PCA or t-SNE algorithms through it. These will cluster the jobs in slightly different ways. Go ahead and look for your job title on the right pane, and see the most similar jobs out there (at least according to BERT).&lt;/p&gt;</content><category term="blog"/></entry></feed>