
Kaggle beginner tips

These are a few points from an email I sent to members of the Data Science Sydney Meetup. I suppose other Kaggle beginners may find it useful.

My first steps when working on a new competition are:

  • Read all the instructions carefully to understand the problem. One important thing to look at is what measure is being optimised. For example, minimising the mean absolute error (MAE) may require a different approach from minimising the mean square error (MSE).
  • Read messages on the forum. Especially when joining a competition late, you can learn a lot from the problems other people had. And sometimes there’s even code to get you started (though code quality may vary and it’s not worth relying on).
  • Download the data and look at it a bit to understand it better, noting any insights you may have and things you would like to try. Even if you don’t know how to model something, knowing what you want to model is half of the solution. For example, in the DSG Hackathon (predicting air quality), we noticed that even though we had to produce hourly predictions for pollutant levels, the measured levels don’t change every hour (probably due to limitations in the measuring equipment). This led us to try a simple “model” for the first few hours, where we predicted exactly the last measured value, which proved to be one of our most valuable insights. Stupid and uninspiring, but we did finish 6th :-). The main message is: look at the data!
  • Set up a local validation environment. This will allow you to iterate quickly without making submissions, and will increase the accuracy of your model. For those with some programming experience: local validation is your private development environment, the public leaderboard is staging, and the private leaderboard is production.
    What you use for local validation depends on the type of problem. For example, for classic prediction problems you may use one of the classic cross-validation techniques. For forecasting problems, you should try and have a local setup that is as close as possible to the setup of the leaderboard. In the Yandex competition, the leaderboard is based on data from the last three days of search activity. You should use a similar split for the training data (and of course, use exactly the same local setup for all the team members so you can compare results).
  • Get the submission format right. Make sure that you can reproduce the baseline results locally.
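The local-validation point above can be sketched in a few lines of Python. Everything here is made up for illustration: the data is a toy trend, but the holdout mirrors the Yandex example (score on the most recent days) and the baseline mirrors the DSG Hackathon trick of predicting the last measured value.

```python
import numpy as np

def time_split(timestamps, values, holdout_days=3):
    """Hold out the last `holdout_days` days to mimic a leaderboard
    that is scored on the most recent data."""
    cutoff = timestamps.max() - holdout_days
    train_mask = timestamps <= cutoff
    return values[train_mask], values[~train_mask]

def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

# Toy data: 30 days of a noisy upward trend.
rng = np.random.default_rng(0)
days = np.arange(30)
y = 10 + 0.5 * days + rng.normal(0, 1, 30)
train, valid = time_split(days, y)

# Persistence baseline: predict the last observed training value.
baseline_pred = np.full_like(valid, train[-1])
print(f"baseline MAE on the local holdout: {mae(valid, baseline_pred):.2f}")
```

Scoring candidate models against such a holdout lets you iterate without burning leaderboard submissions.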

Now, the way things often work is:

  • You try many different approaches and ideas. Most of them lead to nothing. Hopefully some lead to something.
  • Create ensembles of the various approaches.
  • Repeat until you run out of time.
  • Win. Hopefully.

Note that in many competitions, the differences between the top results are not statistically significant, so winning may depend on luck. But getting one of the top results also depends to a large degree on your persistence. To avoid disappointment, I think the main goal should be to learn things, so spend time trying to understand how the methods that you’re using work. Libraries like sklearn make it really easy to try a bunch of models without understanding how they work, but you’re better off trying fewer things and developing the ability to reason about why they work or don’t work.

An analogy for programmers: while you can use an array, a linked list, a binary tree, and a hash table interchangeably in some situations, understanding when to use each one can make a world of difference in terms of performance. It’s pretty similar for predictive models (though they are often not as well-behaved as data structures).

Finally, it’s worth watching this video by Phil Brierley, who won a bunch of Kaggle competitions. It’s really good, and doesn’t require much understanding of R.

Any comments are welcome!

Subscribe

    Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.

    Hi Yanir!

    I have a question.

When you say: “For example, minimising the mean absolute error (MAE) may require a different approach from minimising the mean square error (MSE).”, can you explain what kind of approach (or methods, or rules of thumb) you would take to minimise MAE or MSE in machine learning?

    Thanks for your time in advance!

    Regards,

    Flavio

    Hi Flavio!

    The optimisation approach depends on the data and method you’re using.

    A basic example is when you don’t have any features, only a sample of target values. In that case, if you want to minimise the MAE you should choose the sample median, and if you want to minimise the MSE you should choose the sample mean. Here’s a proof of why: https://www.dropbox.com/s/b1195thcqebnxyn/mae-vs-rmse.pdf
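This claim is also easy to check empirically. The following sketch uses a made-up skewed sample to show that the sample median beats the sample mean on MAE, and vice versa for MSE:

```python
import numpy as np

rng = np.random.default_rng(42)
sample = rng.exponential(scale=2.0, size=1000)  # skewed, so mean != median

def mae(c):
    return np.mean(np.abs(sample - c))

def mse(c):
    return np.mean((sample - c) ** 2)

med, mean = np.median(sample), np.mean(sample)
assert mae(med) <= mae(mean)  # the median minimises MAE
assert mse(mean) <= mse(med)  # the mean minimises MSE
```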

    For more complex problems, if you’re using a machine learning package you can often specify the type of loss function to minimise (see https://en.wikipedia.org/wiki/Loss_function#Selecting_a_loss_function). But even if your measure isn’t directly optimised (e.g., MAE is harder to minimise than MSE because it’s not differentiable at zero), you can always do cross-validation to find the parameters that optimise it.
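As a sketch of that cross-validation approach (the model, data, and parameter grid below are arbitrary choices for illustration): even though ridge regression optimises a squared loss internally, scikit-learn lets you select its regularisation strength by cross-validated MAE.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(0, 0.5, 200)

search = GridSearchCV(
    Ridge(),
    param_grid={"alpha": [0.01, 0.1, 1.0, 10.0]},
    scoring="neg_mean_absolute_error",  # select by MAE rather than the default R^2
    cv=5,
)
search.fit(X, y)
print("best alpha by cross-validated MAE:", search.best_params_["alpha"])
```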

    I hope this helps.

    Hi Yanir!

    I appreciate your work! I’d like to know: should I jump directly into machine learning algorithms and programming, or first master maths and statistics? I am new to this field.


    Can you elaborate on what you mean in Tip 5 by stating “The main scenarios when you should skip local validation is when the data is too small …”? What I experienced is that with too few observations, the leaderboard becomes very misleading, so my intuition would be to use more local validation for small datasets, not less.

    Good point. What I was referring to are scenarios where local validation is unreliable.

    For example, in the Arabic writer identification competition (http://blog.kaggle.com/2012/04/29/on-diffusion-kernels-histograms-and-arabic-writer-identification/), each of the 204 writers had only two training paragraphs (all containing the same text), while the test/leaderboard instances were a third paragraph with different content. I tried many forms of local validation but none of them yielded results that were consistent with the leaderboard, so I ended up relying on the leaderboard score.

    Ah, thanks, that clarifies what you meant. The (currently still running) Africa Soil Property contest (https://www.kaggle.com/c/afsis-soil-properties) seems a bit similar. I won’t put much more energy into that contest, but I am curious how it will work out in the end, and what things will have worked for the winners (maybe not much except pure luck).

    Could you provide some tips on #3 (‘Getting to know your data’) with respect to best-practice visualisations for gaining insights from data – especially considering the fact that datasets often have a large number of features. Plotting feature-vs-label graphs does seem helpful, but for a large number of features it will be impractical. So how should one go about data analysis via visualisation?

    It really depends on the dataset. For personal use, I don’t worry too much about pretty visualisations. Often just printing some summary statistics works well.

    Most text classification problems are hard to visualise. If, for example, you use bag of words (or n-grams) as your feature set, you could just print the top words for each label, or the top words that vary between labels. Another thing to look at would be commonalities between misclassified instances – these could be dependent on the content of the texts or their length.
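For example, “print the top words for each label” can be as simple as counting words per label. The mini-corpus below is made up for illustration; on a real dataset you might use scikit-learn’s CountVectorizer instead:

```python
from collections import Counter

# Hypothetical mini-corpus of (text, label) pairs.
docs = [
    ("great fun loved it", "pos"),
    ("loved the acting great film", "pos"),
    ("boring waste of time", "neg"),
    ("terrible boring plot", "neg"),
]

# Accumulate word counts per label.
by_label = {}
for text, label in docs:
    by_label.setdefault(label, Counter()).update(text.split())

for label, counts in by_label.items():
    print(label, counts.most_common(3))
```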

    Examples:

    • In the Greek Media Monitoring competition (http://yanirseroussi.com/2014/10/07/greek-media-monitoring-kaggle-competition-my-approach/), I found that ‘Despite being manually annotated, the data isn’t very clean. Issues include identical texts that have different labels, empty articles, and articles with very few words. For example, the training set includes ten “articles” with a single word. Five of these articles have the word 68839, but each of these five was given a different label.’ – this was discovered by just printing some summary statistics and looking at misclassified instances
    • Looking into the raw data behind one of the widely-used sentiment analysis datasets, I found an issue that was overlooked by many other people who used the dataset: http://www.cs.cornell.edu/people/pabo/movie-review-data/ (look for the comment with my name – found four years after the original dataset was published)

    I hope this helps.

    Thanks a lot! So to summarize, the following three approaches could help:

    1. Using summary statistics such as means, standard deviations, and variances, and looking out for outliers in the data
    2. Looking at misclassified instances during validation to find some sort of pattern in them
    3. Looking at label-specific raw data

    I apologize for the long overdue response, and thanks for these tips. This will surely be useful in my next Kaggle competition.
    Reblogged this on Dr. Manhattan’s Diary and commented:

    Building a Bandcamp recommender system (part 1 – motivation)

    I’ve been a Bandcamp user for a few years now. I love the fact that they pay out a significant share of the revenue directly to the artists, unlike other services. In addition, despite the fact that fans may stream all the music for free and even easily rip it, almost $80M has been paid out to artists through Bandcamp to date (including almost $3M in the last month) – serving as strong evidence that the traditional music industry’s fight against piracy is a waste of resources and time.

    One thing I’ve been struggling with since starting to use Bandcamp is the discovery of new music. Originally (in 2011), I used the browse-by-tag feature, but it is often too broad to find music that I like. A newer feature is the Discoverinator, which is meant to emulate the experience of browsing through covers at a record store – sadly, I could never find much stuff I liked using that method. Last year, Bandcamp announced Bandcamp for fans, which includes the ability to wishlist items and discover new music by stalking/following other fans. In addition, they released a mobile app, which made the music purchased on Bandcamp much easier to access.

    All these new features definitely increased my engagement and helped me find more stuff to listen to, but I still feel that Bandcamp music discovery could be much better. Specifically, I would love to be served personalised recommendations and be able to browse music that is similar to specific tracks and albums that I like. Rather than waiting for Bandcamp to implement these features, I decided to do it myself. Visit BCRecommender – Bandcamp recommendations based on your fan account to see where this effort stands at the moment.

    While BCRecommender has already helped me discover new music to add to my collection, building it gave me many more ideas on how it can be improved, so it’s definitely a work in progress. I’ll probably tinker with the underlying algorithms as I go, so recommendations may occasionally seem weird (but this always seems to be the case with recommender systems in the real world). In subsequent posts I’ll discuss some of the technical details and where I’d like to take this project.


    It’s probably worth noting that BCRecommender is not associated with or endorsed by Bandcamp, but I doubt they would mind since it was built using publicly-available information, and is full of links to buy the music back on their site.


      Hi!

      I just found these articles a few years after their publication… I saw that the BCRecommender seems not active anymore and that the last post is from 2015.

      Any update? I’m interested to have your feedback.

      Thanks,

      Clément

      So true. Thanks for saying it so well.


        Hi, very nice trick! Trying to implement this as we speak, does this code still work? I get to the collections page, but I don’t think the upload is working. I’m new to Phantomjs. Thanks!
        Hi Walter! Yeah, the code stopped working when Parse redesigned their website. I never fixed it because I ended up porting my projects away from Parse. If you fix it let me know and I’ll update this post. By the way, you may find it easier to use Selenium (or something similar) as a wrapper around PhantomJS, as it should result in cleaner code. For example, check out Python’s Selenium bindings: http://selenium.googlecode.com/svn/trunk/docs/api/py/index.html

          I do not understand how your feature set helped the model learn anything. For example, user_num_query_actions (the number of queries performed by the user): how will it affect the order of search results for a new/test query?


          “What I really wanted was a stable part-time gig.”: They’re remarkably hard to find. It’s an absurdity of our time that many people are overemployed - selling more of their time than they want for more money than they need - even while many other people are underemployed - unable to sell enough of their time for enough money to live comfortably.
          That’s very true. The interesting thing is that it’s a problem that is not unique to this century. It was discussed by Thoreau in Walden (1854), Bertrand Russell in In Praise of Idleness (1932), and David Graeber in On the Phenomenon of Bullshit Jobs (2013), to name a few. People seem to be worried about robots taking their jobs, but the scarier thought is that robots will never take our jobs, because we’ll keep coming up with ways of staying employed rather than enjoy the affluence afforded by technological advancements.


          Thanks for sharing your standpoint on this.


          Thanks for the stimulation. I’m still fascinated by the lure to extract sentiment from text, but it seems like so often the sentiment that the author intended never fully came to expression in the text. Maybe an interdisciplinary approach will be required to teach machines to parse the intentions implicit in text, and, like other media phenomena, a loop will have to form: perhaps the awareness that explicit intentions and sentiment are of benefit to authors in a world that (one day) automates the sorting of all its documents will cause writing styles to adapt. The effect of the best we can on what we’re doing now is one of those things you begin to see a pattern in. Here’s an API that correlates patterns of unstructured info: dev.keywordmeme.com Would love your feedback. Let me know if it’s useful to you or if you have any comments. Well done on the carbon post, btw. Glad I found your blog.

          Thank you for the comment! I agree that analysing sentiment is very tricky due to the fact that people often don’t express themselves so well. If I remember correctly, inter-annotator agreement on some sentiment analysis tasks is only 70-80%, so it’s unlikely that machines will ever achieve perfect performance.

          dev.keywordmeme.com redirects to a github page – where is the API?

          You bet. I’m fascinated to see what seems to be a real live push toward an interdisciplinary approach. That 70-80% performance might be pushed over the hump by humans with special training until such a time as the process can be formalized. It looks like auditing ML-driven processes could be a new category of employment through this next technological plateau. The human-machine relationship in a friendly old configuration! Sorry about the link. This should work: http://www.keywordmeme.com/. It makes you register, just a heads up. Hit the engineers up on github if you have any questions or if things aren’t working. Which is possible. Take care! :)

          Hi Yanir

          Thank you very much for this post. Helpful for somebody like me seeking to become a data scientist.

          I’m a software engineer, currently master data architect.

          I’m taking MOOCs in order to fill the gaps, so let’s say I’m on a good track :)

          However, once I’ve found a problem and got my hands dirty, how do I find a mentor? And afterwards, how do I get published?

          I think this would be hard via academic channels.

          Finding a mentor depends on where you are. Good places to start would be your current workplace (if you work with data scientists), or local meetups (if there are any in your area). Another option would be to contribute to open source projects in the field as a way of getting to know people and getting feedback. Finally, there are courses like the one by Thinkful, where you can pay to be mentored.

          Regarding getting published, I agree that it’d be hard to get published in many academic venues without help from people who know how it’s done. However, you can always start your own blog and link to it from places like Reddit and DataTau. Even if you don’t get any feedback, publishing often forces you to think more deeply about the subject of your article.

          At the workplace, it will be a bit hard.

          I Live in Paris, meetups would be a good option.

          You’re right, publishing forces you to think more deeply; feedback from readers is also a good way to learn.


          I think it’s all about what you expect. We used Parse for prototypes and it’s been working great for us so far. I actually think we still have one of the prototypes running over it which we haven’t touched in almost a year (a mobile/web strategy game now only available on FB - https://apps.facebook.com/foresttribes/). It’s been great since it also saved us the need to develop a backend admin tool to manage/balance the game or add additional content.

          Overall, I never had a real live public product running on Parse to comment on the experience, but for prototypes I’m perfectly happy with the service.

          Agreed, it’s perfectly fine for prototypes, but a bit too unreliable for public-facing live products. If Parse were more robust, it’d be perfect for many use cases.


          I enjoyed the post - though I offer some contrary points to consider:

          I have learned that if it is clear that you will need a data scientist (judged by someone who knows what they do), then you should get them as soon as possible. Don’t wait. Data scientists work best when they have full context for the problem they are here to solve. Getting them in early allows them to help frame the problem. This framing is critical. If the framing is off, it takes a very long time (sometimes never) to get it back on track. A late-to-the-game data scientist can be too influenced by the existing framing they are given. They tend to think within that box, when in reality, the box was never the right way to approach the problem. Even if they do see outside of it, it can be very difficult to convince the original framers that there is a better way to do things (people can get quite attached to their vision).

          It also can be wise to NOT WAIT till there is data to analyze. Too often, data is an afterthought. It’s important for the data scientist to get in early on the initiative so he or she can help define the needed instrumentation and data acquisition strategy. They can even guide the needs of the data warehouse and other repositories where the newly captured data will reside.

          Further, it is often the case that it is the data scientist that identifies the specific problem to solve. At my company, I estimate that over half of the ideas for new data products, features, and services come from the data science team – not the business. This is intuitive as the data scientists are the folks that are most intimate with the data and are least constrained by what is possible to do with data. Give them business context and they will come up with problems/solutions that no one has thought of.

          Finally, I find heuristics to be dangerous. At best they are suboptimal, and more often than not, they are just plain wrong (those with extensive A/B testing experience can attest to the fact that our intuition fails us again and again). Undoing a bad heuristic can be very painful - in the technical work, the coordination work, and in the resetting of expectations. It’s hard to get people to not walk on a paved path … even if that path is the long way or a dead-end.

          I totally agree with “Q5: Are you committed to being data-driven?”. This comes down to business model and culture. Is your business model one where data science can be the source of strategic differentiation? Is your culture able to support empiricism? The answer to both of these has to be ‘yes’ in order to commit to being data-driven.

          Thank you for your thoughtful comments, Eric!

          I generally agree that it can be beneficial to involve data scientists early on and to avoid thoughtless heuristics, but that it all depends on having a supportive data-driven environment and on resource constraints. As mentioned under Q2, getting advice from a data scientist in the early stages of the product is worthwhile, so it may be smart to pay for a few days of consulting, but not necessarily a good idea to hire a full-timer. A lot of it depends on the general product vision.

          Another note regarding heuristics and intuition: While some may be dangerous, you can view many modelling decisions as heuristics. For example, when building a predictive model, you have to make some intuition-driven choices around features (no model uses all the knowledge in the world), learning algorithms and their hyperparameters. You just can’t test everything, so there’s a need for compromises if you aim to ever deliver anything.


          How did you arrive at the conclusion that accuracy doesn’t matter? The Netflix quote and chart don’t seem connected to me. The quote refers to the massive ensembling used to achieve the challenge score threshold. The chart seems to show that you can go a long way from the baseline by improving features and models.

          I’d say accuracy, or more generally, the score used for evaluation, doesn’t matter as long as it’s good enough. However, it’s not that easy to arrive at “good enough”. Consider Spotify: I find their daily recommendations abysmal. Discover Weekly is much better, but still has room to improve.

          I agree. I said that predictive accuracy has some importance, but it is not the only thing that matters. You’re right about it needing to be good enough, where the definition of good enough is domain-dependent.

          Daniel Lopresti said it well years ago (he spoke about web search but it applies to recommendation scenarios where suggestions are browsable):

          Browsing is a comfortable and powerful paradigm (the serendipity effect). Search results don't have to be very good. Recall? Not important (as long as you get at least some good hits). Precision? Not important (as long as at least some of the hits on the first page you return are good).

          It is not clear to me why accuracy is not important in recommenders and search?

          It is important, but its importance tends to be exaggerated to the exclusion of all other metrics. As I said in the post, things like the way you present your results (UI/UX) and novelty/serendipity are also very important. In addition, the goal of the system is often to optimise a different goal from offline accuracy, such as revenue or engagement. In such cases it is best to focus on what you want to improve rather than offline accuracy.

          By the way, I attended a talk by Ted Dunning a few months ago, where he said that one of the most important tweaks in real-life recommender systems is adding random recommendations (essentially decreasing offline accuracy). This allows the system to learn from user feedback on a wider range of items, improving performance in the long run.
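A minimal sketch of that exploration idea (an epsilon-greedy mix; the names and the mixing scheme are made up for illustration, not Dunning's actual method):

```python
import random

def recommend(ranked_items, catalogue, k=5, epsilon=0.1, seed=0):
    """Return top-k recommendations, replacing each slot with a random
    catalogue item with probability epsilon, to keep gathering feedback
    on the long tail."""
    rng = random.Random(seed)
    recs = []
    for item in ranked_items[:k]:
        if rng.random() < epsilon:
            recs.append(rng.choice(catalogue))
        else:
            recs.append(item)
    return recs

ranked = ["a", "b", "c", "d", "e", "f"]
catalogue = ranked + ["x", "y", "z"]
print(recommend(ranked, catalogue))
```

With epsilon set to zero this reduces to plain top-k; the random slots are what let the system learn about items the current model never surfaces.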

          Thank you very much for your fast response.

          [Image: PHD Comics: Science News Cycle]

          Selling your model with simple explanations

          People like simple explanations for complex phenomena. If you work as a data scientist, or if you are planning to become/hire one, you’ve probably seen storytelling listed as one of the key skills that data scientists should have. Unlike “real” scientists that work in academia and have to explain their results mostly to peers who can handle technical complexities, data scientists in industry have to deal with non-technical stakeholders who want to understand how the models work. However, these stakeholders rarely have the time or patience to understand how things truly work. What they want is a simple hand-wavy explanation to make them feel as if they understand the matter – they want a story, not a technical report (an aside: don’t feel too smug, there is a lot of knowledge out there and in matters that fall outside of our main interests we are all non-technical stakeholders who get fed simple stories).

          One of the simplest stories that most people can understand is the story of correlation. Going back to the running example of predicting health based on diet, it is well-known that excessive consumption of certain fats under certain conditions is correlated with an increase in likelihood of certain diseases. This is simplified in some stories to “consuming more fat increases your chance of disease”, which leads to the conclusion that consuming no fat at all decreases the chance of disease to zero. While this may sound ridiculous, it’s the sad reality. According to a recent survey, while the image of fat has improved over the past few years, 42% of Americans still try to limit or avoid all fats.

          A slightly more involved story is that of linear models – looking at the effect of the most important factors, rather than presenting a single factor’s contribution. This storytelling technique is commonly used even with non-linear models, where the most important features are identified using various techniques. The problem is that people still tend to interpret this form of presentation as a simple linear relationship. Expanding on the previous example, this approach goes from a single-minded focus on fat to the need to consume less fat and sugar, but more calcium, protein and vitamin D. Unfortunately, even linear models with tens of variables are hard for people to use and follow. In the case of nutrition, few people really track the intake of all the nutrients covered by recommended daily intakes.

          Few interesting relationships are linear

          Complex phenomena tend to be explained by complex non-linear models. For example, it’s not enough to consume the “right” amount of calcium – you also need vitamin D to absorb it, but popping a few vitamin D pills isn’t going to work well if you don’t consume them with fat, though over-consumption of certain fats is likely to lead to health issues. This list of human-friendly rules can go on and on, but reality is much more complex. It is naive to think that it is possible to predict something as complex as human health with a simple linear model that is based on daily nutrient intake. That being said, some relationships do lend themselves to simple rules of thumb. For example, if you don’t have enough vitamin C, you’re very likely to get scurvy, and people who don’t consume enough vitamin B1 may contract beriberi. However, when it comes to cancers and other diseases that take years to develop, linear models are inadequate.

          An accurate model to predict human health based on diet would be based on thousands to millions of variables, and would consider many non-linear relationships. It is fairly safe to assume that there is no magic bullet that simply explains how diet affects our health, and no superfood is going to save us from the complexity of our nutritional needs. It is likely that even if we had such a model, it would not be completely accurate. All models are wrong, but some models are useful. For example, the vitamin C versus scurvy model is very useful, but it is often wrong when it comes to predicting overall health. Predictions made by useful complex models can be very hard to reason about and explain, but it doesn’t mean we shouldn’t use them.

          The ongoing quest for sellable complex models

          All of the above should be pretty obvious to any modern data scientist. The culture of preferring complex models with high predictive accuracy to simplistic models with questionable predictive power is now prevalent (see Leo Breiman’s 2001 paper for a discussion of these two cultures of statistical modelling). This is illustrated by the focus of many Kaggle competitions on producing accurate models and the recent successes of deep learning for computer vision. Especially with deep learning for vision, no one expects a handful of variables (pixels) to be predictive, so traditional explanations of variable importance are useless. This does lead to a general suspicion of such models, as they are too complex for us to reason about or fully explain. However, it is very hard to argue with the empirical success of accurate modelling techniques.

          Nonetheless, many data scientists still work in environments that require simple explanations. This may lead some data scientists to settle for simple models that are easier to sell. In my opinion, it is better to make up a simple explanation for an accurate complex model than settle for a simple model that doesn’t really work. That being said, some situations do call for simple or inflexible models due to a lack of data or the need to enforce strong prior assumptions. In Albert Einstein’s words, “it can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience”. Make things as simple as possible, but not simpler, and always consider the interests of people who try to sell you simplistic (or unnecessarily complex) explanations.

            Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.


            Excellent, excellent common-sense article, which seems to be very uncommon nowadays in a hype-filled world. Thanks for reminding us that accurate problem description should trump everything else!

            Thank you for a great article. Yes, well-defined problems and well-defined performance evaluation are key to designing any data-driven model.

            I also found that sometimes we have the question we want to pursue, but getting to an answer is not straightforward. For instance, I’m trying to find affinity between food ingredients using only data analytics. One may think that this problem is trivial. In fact, to find this, one has to totally rethink how to represent the data (having ingredients in a table or a dataset produced nothing). Yes, finding affinity between two ingredients is trivial, but as the number grows, one has to change the setting. In my case, I had to think of ingredients as part of a complete network, where the network is a recipe. It was then, and only then, that I was able to find affinity between many ingredients.

            Yes, many good points here, thanks for this. There is even another difficulty apart from problem definition and solution measurement: the semantics of the data itself. Are the definitions real (referring to other concepts) or nominal (“a cheeseburger is a burger with cheese”)? Scope and context can easily be lost, and can only be put back by a human being taking a decision; no amount of empirical modelling can re-discover this. Also, the precision and accuracy of the data may be unknown and/or insufficient to solve the problem posed. If you have detected these issues, sometimes you can re-formulate the problem, but typically it’s not clear from the column headings alone (if you even have these). Even worse, the definitions may be incoherent or nonsensical: e.g., in classical econometric modelling, the definition of a rational agent entails that the agent have knowledge of the future!
            Thanks Andrew! I agree that often what you can do is very limited by the data. I’ve also encountered cases where I had to infer meaning from cryptic column names. In many cases the small arbitrary decisions that we make along the way can have a major influence on the final results!

            Is this article a Poe? The amount of muddled priors throughout it is disturbing. The word “sophistry” keeps leaping to mind. E.g.:

            > For instance, if a government embarks on the building of a pyramid […] s/“a government”/Paris/g

            What would real LinkedIn insights look like? First, I think that the focus on profile views is somewhat misguided. It’s not that hard to artificially generate profile views – simply view other people’s profiles. There is no intrinsic value in someone having viewed your profile – the value comes from a connection that leads to an interesting offer or conversation. Second, LinkedIn is about professional networking that is based on real-world activity. As such, it only forms a small part of the world of professional networking by allowing people to have an online presence that makes them contactable by people they don’t already know. When it comes to insights, it’d be useful to know the true causal factors that lead to interesting connections – much more useful than suggestions such as “add software development as a skill on your profile to get up to 3% more profile views”.

            Summary: Real insights are about the why

            There are many other examples of pseudo-insights out there. The reason is probably that the field of analytics is becoming increasingly commoditised, and it is easier to rebrand an analytics dashboard as an insights dashboard than to provide real insights. Providing real insights requires moving up the DIKW pyramid from data and information to knowledge and wisdom – from describing the past to learning general lessons that allow you to influence the future. Providing real insights can be very hard, as it often requires inferring the causes of events – the why that comes after the what and how. More on this later – I have just started reading Samantha Kleinberg’s Why: A Guide to Finding and Using Causes and will report (hopefully real) insights on causality in future posts.

              Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.

              Nice post. Mostly agree. Insights are hard to automate, though, but we (the WordPress.com Data Team) are working on it.

              Some of the things we’ve found that have the biggest impact on successful blogging are:

              • Turning on Publicize so your posts are pushed out to various social media channels
              • Publishing regularly. It doesn’t have to be daily, but it does need to be regular. We still don’t understand how the periodicity plays into this.
              • Including images in posts, which is correlated with more traffic.

              There’s still a lot to learn here. Interested in helping? https://automattic.com/work-with-us/data-wrangler/ :)

              Thanks Greg! All those factors make sense. Personally, I prefer sharing posts manually to turning on Publicize, but I suppose it has the same effect. My guess is that one of the reasons why images are important is that having at least one image makes posts stick out when shared on social media.

              By the way, I did apply for the data wrangler position a couple of months ago but never heard back. It’s probably too late now, as I have a different position (and a few other options) lined up when I get home from vacation next month :)

              Hey Yanir, that’s embarrassing. :)

              Sorry we haven’t gotten back to you yet. I do see you in our queue. It’s been a busy two months, so we’re a bit backed up, but we’re getting back on track in the next week or two. Certainly understand if that doesn’t fit into your own timeline. Sorry if that ends up being the case.


              It seems to me that causality is another of our thought conveniences, just one more attempt at linearising our frustratingly non-linear existence, akin to teaching with Newtonian physics, segue to Einstein’s relativistic mechanics when the kids are ready (if ever). Cyclic systems can self-perpetuate in non-repeating cycles (chaos theory) but also respond with or resist change arising from external inputs. I believe when people speak of causality, what they are really thinking about (and desiring) is a conversation around stability versus volatility.

              Hey Yanir - great post.

              If you’ve not already, you should read Mostly Harmless Econometrics. They take quite a different approach to causality than Pearl (though there is a lot of conceptual overlap). It definitely helps build intuition for the topic. It’s also worth reading the relevant mid-70s papers from Rubin.

              Thanks for the pointers, Jim! I’ll check those resources out.
              It’s not a different approach. The notation is different but the two frameworks (Pearl’s and Neyman-Rubin) have been proved equivalent.
              I took a look at the Amazon sample for Causality, Probability and Time but I doubt if I’ll buy it just yet. I’ve got Judea Pearl’s Probabilistic Reasoning in Intelligent Systems already and think I want to work through that in a programming language (R is my first choice) before buying any more books. ;-)
              I appreciate this post. I teach General Psychology, and this is a central issue that I present to my students. In the meantime, I regularly come across articles, in peer-reviewed as well as mainstream publications, which discuss correlational data as if it were supporting a causal relationship. As I tell my students, one of the difficulties is the use of the word “factor” in both types of discussions. In correlation, factors are pieces of information which give you a more likely guess about an unknown piece of information. In causation, factors are things that contribute to something else existing. Both concepts feed the mind’s desire to find patterns in the relevant world which inform our decisions/behaviors so that we can continue living, hopefully in a pleasant state. We are often tricked by these patterns (illusions, etc.), but most of the time they pan out in a beneficial way. Making the leap from “this is how things tend to work in my immediate experience” to “this is how things work everywhere for everyone” is where theories are born, where science lives, and where we often make mistakes along the way. Proceed with caution from observation to theory, but by all means, proceed!
              Really great read! This is something many of my colleagues have discussed in the past. Here’s an article that might help us get closer to causality with observational data: http://goo.gl/MP7WQo, and here is a video about it: https://www.youtube.com/watch?v=uhONGgfx8Do
              The problem with the search for causality (or, more generally, explainability) is that in many cases it is “not interesting”. If I click on Google search results, neither I nor Google’s algorithm developers are truly interested in how the algorithm decided to rank Page A before Page B. It is OK for me, as an end user, not to care about those details, just as I don’t care about hydraulics every time I take a shower. Is it OK for me, as a data scientist, not to care about the reasons behind my models? Honestly, I don’t yet know.

              I agree that in many cases the reasoning behind models isn’t interesting, as long as the models produce satisfactory results. Web search is actually a good example. Yes, many end users don’t really care how Google ranks pages, but SEO practitioners go to great lengths to understand search algorithms and get pages to rank well (see https://moz.com/search-ranking-factors for example).

              As data scientists, it’s important to consider model stability in production. Sculley et al. said it well in their paper on machine learning technical debt (http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43146.pdf): “Machine learning systems often have a difficult time distinguishing the impact of correlated features. This may not seem like a major problem: if two features are always correlated, but only one is truly causal, it may still seem okay to ascribe credit to both and rely on their observed co-occurrence. However, if the world suddenly stops making these features co-occur, prediction behavior may change significantly.”
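The failure mode Sculley et al. describe is easy to reproduce with a toy model (synthetic data; the feature names are mine, not from the paper). Here a duplicated feature gets half the credit during training, and predictions degrade badly once the co-occurrence stops holding:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
causal = rng.normal(0, 1, n)
proxy = causal.copy()                       # always co-occurs with `causal`
y = 2.0 * causal + rng.normal(0, 0.1, n)    # only `causal` truly drives y

# The minimum-norm least squares solution splits the credit between the two
# identical features instead of attributing it all to the causal one.
X = np.column_stack([causal, proxy])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
print("fitted weights:", np.round(coefs, 2))

# The world stops making the features co-occur: `proxy` is now independent.
causal_new = rng.normal(0, 1, n)
proxy_new = rng.normal(0, 1, n)
pred = np.column_stack([causal_new, proxy_new]) @ coefs
rmse = np.sqrt(np.mean((pred - 2.0 * causal_new) ** 2))
print(f"RMSE after the shift: {rmse:.2f}")  # far above the ~0.1 noise level
```

In-sample everything looks fine, because crediting either feature gives the same predictions; the damage only shows up when the correlation structure changes in production.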

              Finally, in many cases what we really care about is interventionality. I don’t think it’s a real word, but what it means is that you don’t really care whether A causes B, you want to know whether intervening to change A would change B. These inferences are critical in fields like medicine and marketing, but we can look at an example from the world of blogging, which is probably more relevant to you. Many bloggers would like to attract more readers. A possible costly intervention would be to switch platforms from WordPress to Medium. Cheaper interventions may be changing the site’s layout, writing titles that get people interested, and posting links to your content on relevant channels. Another intervention would be trying to post at different times (as implied by WordPress insights and discussed in http://yanirseroussi.com/2015/12/08/this-holiday-season-give-me-real-insights/). Obviously, one would like to apply the interventions with the highest return on investment first, and data that helps with ranking the interventions is very interesting.

              James Woodward’s Making Things Happen gives a fantastic, relatively non-technical analysis of causation that fits well with Pearl’s approach.
              Thanks! I’ll check it out.
              I’ve been thinking about this lately quite a bit. The fact that I can type this comment and send it across the internet rests on the ability to create a completely controlled causal environment. Inside the computer, all noise and randomness is kept below the threshold of the data, and every process is completely causal. Meanwhile, outside the computer, most measurements are mostly noise, and extracting any sort of causal relation is very difficult and often impossible. My mind seems to have some sort of idea of cause as something like the interaction of balls on a pool table. The cue ball strikes the eight ball and knocks it into the corner pocket, etc. But when one tries to measure things, mostly one finds nothing like this. Instead, one finds that some measurements tend to be found with other measurements most of the time, but not all of the time. Cause thus seems a statistical thing, and in no way absolute. I have difficulty reconciling the two views. One thing that occurred to me to investigate was the manner in which several huge internet outages developed involving the BGP protocol. It seemed to me that every individual packet must experience a completely causal path, but the aggregate turns into the statistical causal form we most usually deal with. I haven’t followed up on this idea so far, however.
              Interesting. I think that one of the dividing factors between traditional software engineering and data science is the attitude towards uncertainty. Whereas, as you say, coding is all about creating a controlled deterministic environment, data science and statistics thrive on uncertainty. It’s similar with computer networks as well, where there is always a non-deterministic element (e.g., packets may be lost, arrive out-of-order, or come in bursts).

              Causation, Prediction, and Search is also seminal (https://www.cs.cmu.edu/afs/cs.cmu.edu/project/learn-43/lib/photoz/.g/scottd/fullbook.pdf).

              Disclosure: I did my PhD just down the street from the authors of Causation, Prediction, and Search, and Woodward was on my thesis committee.

              There is a subtle difference between Woodward’s approach and that of Pearl and of Spirtes et al., which Glymour discusses in the following places:

              https://www.ncbi.nlm.nih.gov/pubmed/24887161 http://repository.cmu.edu/cgi/viewcontent.cgi?article=1280&context=philosophy

              Basically, Woodward starts with the notion of an intervention on a variable and defines other concepts (e.g. direct cause) in terms of it, whereas Pearl and Spirtes et al. start with the notion of direct cause. One consequence of this difference is that properties like sex and race that cannot be intervened upon in a straightforward way cannot be causes for Woodward, strictly speaking, but can be for Pearl and Spirtes et al. This is a fine point, however, and it’s very nearly true that they simply provide alternative formulations of the same theory, with Woodward focusing on conceptual issues and the others focusing on methodology.

              Thanks for all the pointers, Greg! I’ll definitely check them out. Personally, I have a slight bias towards Pearl, as he is my academic grandfather (he was my advisor’s advisor), but I’m keen on learning as much as possible on all the different approaches to causality. It is a fascinating area!
              “Thinking, Fast & Slow” touches on some of this in later chapters. Some algebra is used to help illustrate the deception that causality, and efforts towards finding it, can cause.
              Thanks! That book has been on my to-read list for a while now.
              Great post, thanks Yanir. I have been afraid of being lost in big data, i.e., swarmed by such a vast amount of correlations.

              The rise of greedy robots

              Given the impressive advancement of machine intelligence in recent years, many people have been speculating on what the future holds when it comes to the power and roles of robots in our society. Some have even called for regulation of machine intelligence before it’s too late. My take on this issue is that there is no need to speculate – machine intelligence is already here, with greedy robots already dominating our lives.

              Machine intelligence or artificial intelligence?

              The problem with talking about artificial intelligence is that it creates an inflated expectation of machines that would be completely human-like – we won’t have true artificial intelligence until we can create machines that are indistinguishable from humans. While the goal of mimicking human intelligence is certainly interesting, it is clear that we are very far from achieving it. We currently can’t even fully simulate C. elegans, a 1mm worm with 302 neurons. However, we do have machines that can perform tasks that require intelligence, where intelligence is defined as the ability to learn or understand things or to deal with new or difficult situations. Unlike artificial intelligence, there is no doubt that machine intelligence already exists.

              Airplanes provide a famous example: we don’t commonly think of them as performing artificial flight – they are machines that fly faster than any bird. Likewise, computers are super-intelligent machines. They can perform calculations that humans can’t, store and recall enormous amounts of information, translate text, play Go, drive cars, and much more – all without requiring rest or food. The robots are here, and they are becoming increasingly useful and powerful.

              Who are those greedy robots?

              Greed is defined as a selfish desire to have more of something (especially money). It is generally seen as a negative trait in humans. However, we have been cultivating an environment where greedy entities – for-profit organisations – thrive. The primary goal of for-profit organisations is to generate profit for their shareholders. If these organisations were human, they would be seen as the embodiment of greed, as they are focused on making money and little else. Greedy organisations “live” among us and have been enjoying a plethora of legal rights and protections for hundreds of years. These entities, which were formed and shaped by humans, now form and shape human lives.

              Humans running for-profit organisations have little choice but to play by their rules. For example, many people acknowledge that corporate tax avoidance is morally wrong, as revenue from taxes supports the infrastructure and society that enable corporate profits. However, any executive of a public company who refuses to do everything they legally can to minimise their tax bill is likely to lose their job. Despite being separate from the greedy organisations we run, humans have to act greedily to effectively serve their employers.

              The relationship between greedy organisations and greedy robots is clear. Much of the funding that goes into machine intelligence research comes from for-profit organisations, with the end goal of producing profit for these entities. In the words of Jeffrey Hammerbacher: “The best minds of my generation are thinking about how to make people click ads.” Hammerbacher, an early Facebook employee, was referring to Facebook’s business model, where considerable resources are dedicated to getting people to engage with advertising – the main driver of Facebook’s revenue. Indeed, Facebook has hired Yann LeCun (a prominent machine intelligence researcher) to head its artificial intelligence research efforts. While LeCun’s appointment will undoubtedly result in general research advancements, Facebook’s motivation is clear – they see machine intelligence as a key driver of future profits. They, and other companies, use machine intelligence to build greedy robots, whose sole goal is to increase profits.

              Greedy robots are all around us. Advertising-driven companies like Facebook and Google use sophisticated algorithms to get people to click on ads. Retail companies like Amazon use machine intelligence to mine through people’s shopping history and generate product recommendations. Banks and mutual funds utilise algorithmic trading to drive their investments. None of this is science fiction, and it doesn’t take much of a leap to imagine a world where greedy robots are even more dominant. Just like we have allowed greedy legal entities to dominate our world and shape our lives, we are allowing greedy robots to do the same, just more efficiently and pervasively.

              Will robots take your job?

              The growing range of machine intelligence capabilities gives rise to the question of whether robots are going to take over human jobs. One salient example is that of self-driving cars, which are projected to render millions of professional drivers obsolete in the next few decades. The potential impact of machine intelligence on jobs was summarised very well by CGP Grey in his video Humans Need Not Apply. The main message of the video is that machines will soon be able to perform any job better or more cost-effectively than any human, thereby making humans unemployable for economic reasons. The video ends with a call to society to consider how to deal with a future where there are simply no jobs for a large part of the population.

              Despite all the technological advancements since the start of the industrial revolution, the prevailing mode of wealth distribution remains paid labour, i.e., jobs. The implication of this is that much of the work we do is unnecessary or harmful – people work because they have no other option, but their work doesn’t necessarily benefit society. This isn’t a new insight, as the following quotes demonstrate:

              • “Most men appear never to have considered what a house is, and are actually though needlessly poor all their lives because they think that they must have such a one as their neighbors have. […] For more than five years I maintained myself thus solely by the labor of my hands, and I found that, by working about six weeks in a year, I could meet all the expenses of living.” – Henry David Thoreau, Walden (1854)
              • “I think that there is far too much work done in the world, that immense harm is caused by the belief that work is virtuous, and that what needs to be preached in modern industrial countries is quite different from what always has been preached. […] Modern technique has made it possible to diminish enormously the amount of labor required to secure the necessaries of life for everyone. […] If, at the end of the war, the scientific organization, which had been created in order to liberate men for fighting and munition work, had been preserved, and the hours of the week had been cut down to four, all would have been well. Instead of that the old chaos was restored, those whose work was demanded were made to work long hours, and the rest were left to starve as unemployed.” – Bertrand Russell, In Praise of Idleness (1932)
              • “In the year 1930, John Maynard Keynes predicted that technology would have advanced sufficiently by century’s end that countries like Great Britain or the United States would achieve a 15-hour work week. There’s every reason to believe he was right. In technological terms, we are quite capable of this. And yet it didn’t happen. Instead, technology has been marshaled, if anything, to figure out ways to make us all work more. In order to achieve this, jobs have had to be created that are, effectively, pointless. Huge swathes of people, in Europe and North America in particular, spend their entire working lives performing tasks they secretly believe do not really need to be performed. The moral and spiritual damage that comes from this situation is profound. It is a scar across our collective soul. Yet virtually no one talks about it.” – David Graeber, On the Phenomenon of Bullshit Jobs (2013)

              This leads to the conclusion that we are unlikely to experience the utopian future in which intelligent machines do all our work, leaving us ample time for leisure. Yes, people will lose their jobs. But it is not unlikely that new unnecessary jobs will be invented to keep people busy, or worse, many people will simply be unemployed and will not get to enjoy the wealth provided by technology. Stephen Hawking summarised it well recently:

              If machines produce everything we need, the outcome will depend on how things are distributed. Everyone can enjoy a life of luxurious leisure if the machine-produced wealth is shared, or most people can end up miserably poor if the machine-owners successfully lobby against wealth redistribution. So far, the trend seems to be toward the second option, with technology driving ever-increasing inequality.

              Where to from here?

              Many people believe that the existence of powerful greedy entities is good for society. Indeed, there is no doubt that we owe many beneficial technological breakthroughs to competition between for-profit companies. However, a single-minded focus on profit means that in many cases companies do what they can to reduce their responsibility for harmful side-effects of their activities. Examples include environmental pollution, multinational tax evasion, and health effects of products like tobacco and junk food. As history shows us, in truly unregulated markets, companies would happily utilise slavery and child labour to reduce their costs. Clearly, some regulation of greedy entities is required to obtain the best results for society.

              With machine intelligence becoming increasingly powerful every day, some people think that to produce the best outcomes, we just need to wait for robots to be intelligent enough to completely run our lives. However, as anyone who has actually built intelligent systems knows, the outputs of such systems are strongly dependent on the inputs and goals set by system designers. Machine intelligence is just a tool – a very powerful tool. Like nuclear energy, we can use it to improve our lives, or we can use it to obliterate everything around us. The collective choice is ours to make, but it is far from simple.

                Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.

                Yes, the world has always been greedy. This reminds me of Dijkstra’s greedy algorithm, which is used to find the shortest route. There are a lot of “steps” for an organization to become profitable. Greediness tries to find the most cost-efficient way to achieve the goal of being profitable. Let us assume that each road is a railway and trains traverse to their destinations. Each decision path will sacrifice other trains waiting to cross to their destination. If human stupidity does not overrule again, our scarce resources will ultimately be constrained by economics to one element only: time. Where do we want humans to allocate their time?

                Greediness will always thrive in the sense that it is seen as a trait of growth by society. War, which in today’s society we ultimately condemn, was viewed in the past as one way for a nation to gain growth. Growth was limited to the domain of a specific country, and the rest were treated as enemies. The end of war was enforced by the right not to interfere with one another’s property. People had to find other means to gain growth, and thus the concept of greediness was reinforced. Greediness is using emotional appeal to manipulate other people’s habits within a specific domain.

                The problem with greediness is whether people evaluate the emotional appeal as matching something positive or negative for themselves and society. The maximum capacity for getting that right depends on:

                1. The effort of people to have multi-disciplinary knowledge of multiple domains
                2. The effort of people to apply their knowledge in their daily decisions.

                Most consumers are passive on the above two points due to society constraints. More specifically, if people focus learning from other domains, they have the risk of underperforming on their main domain having a competitive disadvantage on their prospect of their career. This limited domain in bayesian terms makes people have low confidence for many topics leaving others to influence our decision making. I think that low confidence is the main causation we see a high rise trend where people’s decisions rely more upon push messages instead of pull messages. I haven’t seen to this day sophisticated push messages where the user has a choice of options what to see except in the on boarding phase of a product. In addition, the on boarding phase are only reasons why you should use me. There will never be a phase on reasons when not to use me. If a specific domain of a product could say to the user based on the personalized information it gathered: “Hey, you shouldn’t be using me in this situation. Use Bob instead, it will make your life easier” (This will be possible with the evolution of data). The problem is a specific domain will never explain alternative domains that can solve a user’s individual problem better because there is not a commission fee of recommending one user to another domain with qualitative information. This causes a domain which consists a set of employees to not have an interest in researching alternative domains that can solve a specific problem better because the current system has not placed a platform to reward it with a commission fee. Instead, the only way for a specific domain to thrive is by copying others ideas or owning them through acquisitions. This demotivates innovation in great sum. So far, it is only people with consciousness, with value or no value, such as start-up entrepreneurs that leave old positions and people who contribute in open source correspondingly, that go the extra mile to innovate. 
My whole hypothesis is that our natural instincts are those of a machine learner, and our only task is to make progress on everything, even our own personal lives.

                If those two points happen, the rule of greediness will be overruled. People will consciously evaluate whether an emotional appeal makes sense in the big picture, because their jobs will force them to associate their domain with alternatives to gain a commission fee. That will give them more robust interdisciplinary domain knowledge, making them more confident pulling information about other domains they come to know, rather than having it pushed at them. Passive consumers will become less passive. Value was once driven by war, now by greediness; later it will be all about evaluation.

                Your point about people doing less work implies a society even more passive than it already is. I do not propose that, as it would make our situation worse. The problem is the type of tasks people do, not tasks themselves. People need to do tasks that progress our society instead of being passive, as in the game of Civilization. It is the only way to be happy and have a purpose. Just as the machine learning instances we humans create have an end-goal purpose, we humans are machine learners with a purpose: to handle any situation that becomes a problem. Our starter pack of problems was human suffering, hunger, and death. Now these press on us less, and we have to find motivation beyond extrinsic rewards.

                Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.

                Interesting point on the causal significance. How does this work when you have confounders in x? I’d have thought that x must contain the set of prima facie causes for which we have true exogenous variation.

                Also, how does it work when you have bad controls in x (where x includes post-treatment causes that are plausibly varied by c)?

                Good questions :)

                To be honest, I’m not completely sure it works in all these cases, as there is always a need for interpretation to decide whether the identified causes are genuine. I tried playing a bit with the toy data from Pearl’s report on Simpson’s Paradox, but the results are not entirely convincing. However, I’m also not fully convinced that Pearl’s solution fully resolves Simpson’s Paradox, and Kleinberg does go through a few scenarios where her approach doesn’t work in her book, so I’d say that there are still quite a few open problems in the area.

                Post-treatment causes are partly addressed by the definition in Huang and Kleinberg (2015), where significance is weighted by the number of timepoints where e follows c. Again, that definition doesn’t handle all cases, but I think it’s an interesting line of research. I would definitely like to see their results reproduced by other researchers and expanded to other datasets, though.
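To give a concrete feel for the kind of quantity being discussed, here is a toy Python sketch of average causal significance: the change in the probability of an effect when the cause holds, averaged over background factors. This is an illustration only, with made-up variable names and binary data; it omits the time windows that are central to Kleinberg's actual definitions.

```python
import random

def avg_significance(data, c, e, background):
    """Toy version of average causal significance: the change in P(e)
    when c holds, averaged over strata defined by background factors.
    `data` is a list of dicts mapping variable names to booleans."""
    diffs = []
    for x in background:
        with_c = [row[e] for row in data if row[c] and row[x]]
        without_c = [row[e] for row in data if not row[c] and row[x]]
        if with_c and without_c:
            diffs.append(sum(with_c) / len(with_c)
                         - sum(without_c) / len(without_c))
    return sum(diffs) / len(diffs) if diffs else 0.0

# Toy data: e follows c with probability 0.8, and occurs with
# probability 0.2 otherwise, regardless of the background factor b.
rng = random.Random(0)
data = []
for _ in range(2000):
    c = rng.random() < 0.5
    data.append({'c': c,
                 'b': rng.random() < 0.5,
                 'e': rng.random() < (0.8 if c else 0.2)})
print(avg_significance(data, 'c', 'e', ['b']))  # close to 0.6
```

On data like this, where the background factor is irrelevant, the estimate recovers the raw probability gap; the interesting (and hard) cases are the ones discussed above, where background factors confound or mediate the relationship.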

                Excellent article! It has been very useful to understand what the topic of causality is about and triggered my interest to continue learning more!
                Thanks for this post! I share your troubles over Pearl/time/feedback loops!
                Nice post. Have you had any chance to apply them to real datasets? Please share those results.
                Great post. I did not know about Kleinberg and Hill’s work. I knew a similar list of criteria from this article, which is much more recent: https://doi.org/10.1177%2F0951629805050859 Regarding Kleinberg: adding time certainly is valuable, but doesn’t the smoking example change the research question from whether smoking causes lung cancer to when it causes lung cancer? The latter question is more informative and implies the former, but I’d say it is fine to ask the first question when one is not interested in the time of occurrence of cancer.
                Thank you! I agree that the latter question is more informative, but I now think that saying that “smoking causes cancer” isn’t particularly meaningful, as it ignores both timing and dosage. A good summary of the case for well-defined interventions was provided by Miguel Hernán in this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5207342/
                The limits of Pearl’s theory on feedback loops bothers me too. However, have you studied much Control Theory? Or dynamical systems in general? It explicitly deals with feedback loops. I’d be keen to get your thoughts on the comparison of Control Theory vs Pearl’s Causal Inference.
                Thanks for the comment! No, I haven’t studied Control Theory. Maybe I’ll look into it one day. :)

                Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.

                My hunch is that the % of HDI chosen is of less interest to a user than seeing how each test iteration alters the HDI and shifts the level of overlap between the HDI and ROPE toward one of the outcomes. In the example given above, would a fair interpretation be that the differences appear weighted more toward the negative than the positive? As for precision, shouldn’t it be made a function of the minimum effect requested? A larger ROPE would require less precision, and vice versa?
                Thanks for your comment, John! I think that it appears weighted more towards the negative because the beta distribution is symmetric when the mean is 0.5 (alpha = beta), and asymmetric in other cases, making it less pointy. According to Kruschke’s simulations, using the precision stopping rule makes the success rate estimate closer to the true mean of the underlying distribution than with other stopping rules, which tend to overestimate the success rate. I’m not sure we’d get the same results if precision were a function of the minimum effect, but I’d like to run more simulations to get a better feeling for how it works.
                Could you please tell me how you calculate the HDI and ROPE? I am trying to replicate this calculator in R. Thanks!
                The source code for the calculation is here: https://github.com/yanirs/yanirs.github.io/blob/master/tools/split-test-calculator/src/bayes.coffee#L139 – it shouldn’t be too hard to translate to R.
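For readers who don't want to wade through CoffeeScript, the HDI of a Beta posterior can be approximated by sampling, which is easy to port to R. This is a rough sketch rather than a translation of the linked code; the 95% mass and the example ROPE values are arbitrary choices:

```python
import random

def beta_hdi(alpha, beta, mass=0.95, n=50_000, seed=42):
    """Approximate the highest density interval of a Beta(alpha, beta)
    distribution as the narrowest interval containing `mass` of n samples."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(n))
    k = int(mass * n)
    # Slide a window covering k samples and keep the narrowest one.
    i = min(range(n - k), key=lambda j: samples[j + k] - samples[j])
    return samples[i], samples[i + k]

# Posterior for 100 successes in 5000 trials, with a uniform Beta(1, 1) prior.
lo, hi = beta_hdi(1 + 100, 1 + 5000 - 100)
rope = (0.015, 0.025)  # example ROPE around a 2% baseline rate
print(f"95% HDI: [{lo:.4f}, {hi:.4f}]")
print("HDI inside ROPE:", rope[0] <= lo and hi <= rope[1])
```

The sampling approach trades a little precision for simplicity; with a closed-form density you can do better, but the narrowest-window idea is the same.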

                Thanks for the post!

                I’m more of a business stakeholder simply trying to improve our testing practices, rather than a data scientist who understands the theories at a detailed level.

                I’m a bit confused why, if I enter the default example in your calculator (5000 trials each, 100 successes vs 130), the recommendation is to implement EITHER variant.

                Whereas, using a tool such as the following suggests a 97.8% chance the variant with 130 successes will outperform the control: https://abtestguide.com/bayesian/

                This calculator also seems to suggest the 130-successes variant should be chosen, not EITHER, as there is 95% confidence the result is not due to chance: https://abtestguide.com/calc/

                A secondary question is: if there is no predetermined sample size with the Bayesian approach, how do you plan how long to run the test for? Mainly to deal with stakeholder communication and project planning, but also to avoid peeking.
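For what it's worth, the 97.8% figure from the Bayesian calculator linked above is the posterior probability that the variant's true rate exceeds the control's, which can be checked by simulation. This sketch assumes independent uniform Beta(1, 1) priors; the exact number depends on the priors used:

```python
import random

def prob_b_beats_a(successes_a, trials_a, successes_b, trials_b,
                   n=100_000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under independent
    Beta(1, 1) priors on the two conversion rates."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n):
        rate_a = rng.betavariate(1 + successes_a, 1 + trials_a - successes_a)
        rate_b = rng.betavariate(1 + successes_b, 1 + trials_b - successes_b)
        wins += rate_b > rate_a
    return wins / n

# The example from the comment: 100/5000 vs 130/5000 successes.
print(prob_b_beats_a(100, 5000, 130, 5000))  # roughly 0.97-0.98
```

Note that a high probability of the variant being better is not inconsistent with an HDI-plus-ROPE rule recommending either variant: the difference may well be real but too small to matter, which is exactly the case the ROPE is designed to flag.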

                Many thanks,

                Why Data Scientist is a useless job title

                Given that a data scientist is someone who does data analysis, and/or a scientist, and/or an engineer, what does it mean for a person to hold a Data Scientist position? It can mean anything, as it depends on the company and industry. A job title like Data Scientist at Company is about as meaningful as Engineer at Organisation, Scientist at Institution, or Doctor at Hospital. It gives you a general idea what the person’s background is, but provides little clue as to what the person actually does on a day-to-day basis.

                Don’t believe me? Let’s look at a few examples. Noah Lorang (Basecamp) is OK with mostly doing arithmetic. David Robinson (Stack Overflow) builds machine learning features and internal R packages, and visualises data. Robert Chang (Twitter) helps surface product insights, create data pipelines, run A/B tests, and build predictive models. Rob Hyndman (Monash University) and Jake VanderPlas (University of Washington) are academic data scientists who contribute to major R and Python open-source libraries, respectively. From personal knowledge, data scientists in many Australian enterprises focus on generating reports and building dashboards. And in my current role at Car Next Door I do a little bit of everything, e.g., implement new features, fix bugs, set up data pipelines and dashboards, run experiments, build predictive models, and analyse data.

                To be clear, the work done by many data scientists is very useful. The number of decisions made based on arbitrary thresholds and some means multiplied together on a spreadsheet can be horrifying to those of us with minimal knowledge of basic statistics. Having a good data scientist on board can have a transformative effect on a business. But it’s also very easy to end up with ineffective hires working on low-impact tasks if the business has no idea what their data scientists should be doing. This situation isn’t uncommon, given the wide range of activities that may be performed by data scientists, the lack of consensus on the definition of the field, and a general disagreement over who deserves to be called a real data scientist. We need to move beyond the hype towards clearer definitions that would help align the expectations of data scientists with those of their current and future employers.

                It’s time to specialise

                Four years ago, I changed my LinkedIn title from software engineer with a research background to data scientist. Various offers started coming my way, and they haven’t stopped since. Many people have done the same. To be a data scientist, you just need to call yourself a data scientist. The dilution of the term means that as a job title, it is useless. Useless terms are unlikely to last, so if you’re seriously thinking of becoming a data scientist, you should also consider specialising. I believe we’ll see the emergence of new specific titles, such as Machine Learning Engineer. In addition, less “sexy” titles, such as Data Analyst, may end up making a comeback. In any case, those of us who invest in building their skills, delivering value in their job, and making sure people know about it don’t have much to worry about.

                What do you think? Is specialisation inevitable or are generalist data scientists here to stay? Please let me know privately, via Twitter, or in the comments section.

                Subscribe

                Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.

                  I think exactly the same, but for the moment the title is somewhat needed. I’ll give my example: I’m a data scientist because my company wants to differentiate between regular data analysts (who can’t code, but are learning, with me helping them) and backend software engineers (who can code better than me, but lack the business knowledge and have a tendency to throw fancy algorithms at numbers without thinking about method and usefulness for the business).

                  Eventually we will have new job titles, but for now we are stuck with “data scientists”. As soon as the hype fades, we’ll see people moving to new titles.

                  Great article - and really the ambiguity surrounding the Data Scientist title hurts everyone - Data Scientists are frustrated that they’re expected to do everything, and others are frustrated that their Data Scientists can’t do everything that they’ve heard data scientists can do. I think this will change over time as data scientists (or whatever they will be called) roles get further defined.

                  Good article - I’ve always had a bit of a problem with the term “Data Scientist”, in that it implies that the person with such a title is somehow involved in scientific research on data or has a deep academic background, neither of which is usually true.

                  Now, what does everyone think about the term “Data Architect” - I could do a rant on that one. Suffice it to say, you are a DATABASE Architect, not a DATA architect. Data is data, it is just raw numbers. You don’t design data, you design a data model which eventually gets translated into a database. Sorry, I guess that was a bit of a rant …

                  Yeah, people like the word data more than the word database these days. There are also the various places where you can put data. You can drown it in a lake, for example…

                  Great article, and couldn’t agree more - there’s deep irony in “Data Science” as a job title.

                  I’ve started to use the term “Entrepreneurial Analyst” to be more precise about the focus on the outcome and also to allow latitude for hypotheses, exploration and discovery.

                  Reblogged this on codefying and commented: Especially like the antiparallel structure of scientific inquiry and engineering design.
                  I find it interesting that, as you said, data analysis is very useful if done by effective hires. It’s important to understand the data you are provided with and the context it fits in. Otherwise, you could come to conclusions that miss the mark. It is important to have those who are properly qualified analyze data accurately.
                  Great article! I also feel that anybody with five or more years of experience in data science can be considered for a Data Scientist role. Professionals with less experience can always take roles as Data Analysts or Data Engineers. There are many certification programs that can provide you with the relevant skill-sets, such as Hortonworks certifications, Cloudera certifications, Data Science Council of America (DASCA) certifications, etc.

                  Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.

                  Thanks Yanir for this post! Once again, you hit the nail on the head! We’re probably all guilty of making a number of those mistakes at one point or another in our careers. And it wouldn’t surprise me that a lot of companies are making all of those mistakes at the same time. I especially liked #6. Instead of stupidity, I would suggest that ego is responsible for it.
                  Yeah, I think that Bertrand Russell was a bit too harsh – it’s really ignorance that often causes overconfidence rather than stupidity. And yes, I have made this mistake as well. Many things often look misleadingly simple if you don’t get into the fine details.
                  Reblogged this on QA-notes and commented: All common sense, but as with many things, having it written down focusses the mind :-)

                  Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.


                  When I started doing data science in a business setting (after years of doing quantitative genetics in academic settings), I was puzzled by the talk of “customer lifetime value”, partly due to the issues you’ve mentioned. Even with appropriate clarifications - it’s over, say, five years, not forever, etc. - it’s a peculiar quantity, in that as typically calculated, at least by the business-y people in my vicinity, it isn’t an average over a single population of customers. Instead, it’s average, say, first-month net present value over customers who’ve been around for at least one month (or maybe over all such customers who’ve also been around for at most one year, to reduce the influence of customer behavior farther in the past, when the product catalog, marketing strategies, etc. were different), plus average second-month net present value over customers who’ve been around for at least two months, etc., that is, it’s a sum of averages over a sequence of populations of customers (which may not even be nested). And there can be further subtleties. For example, in the context of a “freemium” service such as the one that is my primary client at present, sometimes people want to measure time from when a customer signs up for an account, whereas other times people want to measure time from when a customer first buys something, which may be much later. Altogether, I’ve found that “customer lifetime value” generally requires a good deal of explanation.

                  “no amount of cheap demagoguery and misinformation can alter the objective reality of our world.”: Alas, that isn’t quite true. Next week, the objective reality of how the USA is governed will be altered substantially, partly due to blatant demagoguery and misinformation.

                  Great analysis Yanir!
                  Thanks Ralph! I meant the last sentence in the sense of “and yet it moves”. People’s actions and choices are definitely affected by demagoguery and misinformation, but the spread of misinformation doesn’t change reality by itself. For example, Trump et al.’s climate science denialism isn’t going to alter the reality of anthropogenic climate change, though their actions are probably going to accelerate it.

                  This is why Investment Banking and Venture Capital firms should hire Data Scientists.

                  I think your post and the links you share can have a part on the Google search results as well in near future :)

                  Great post.

                  There’s also the BTYD package in R that I’ve seen used for CLV calculations, although I don’t know if it could be used for anything industrial. All credit for this knowledge goes to Dan McCarthy, who just put out some great research on using CLV in non-contractual settings.

                  Hi Yanir!

                  Nice post.

                  How can the models you mentioned be altered in the case of a subscription based business in order to calculate the lifetime value of the customers?

                  Thanks Eleni! I think that in the case of subscription-based products, you’re better off using different models, as churn is observed and can be predicted (e.g., using a package like lifelines). Once you have an estimate of when a customer is going to churn, it’s easy to estimate their LTV (assuming constant recurring revenue). In any case, the general principle of not using closed formulas without testing their accuracy on your data still applies here.
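To make the last step concrete, here is a minimal sketch of turning a churn estimate into an LTV figure under the constant-recurring-revenue assumption mentioned above. The revenue, horizon, and discount rate are made-up example numbers:

```python
def ltv(monthly_revenue, expected_months, monthly_discount_rate=0.0):
    """Discounted lifetime value under constant recurring revenue,
    given a churn model's estimate of how long the customer will stay."""
    return sum(
        monthly_revenue / (1 + monthly_discount_rate) ** month
        for month in range(expected_months)
    )

# A $30/month customer predicted to churn after 24 months,
# discounted at about 0.8% per month (roughly 10% per year).
print(round(ltv(30, 24, 0.008), 2))
```

In practice you'd replace the single point estimate of churn time with the survival curve from the churn model, weighting each month's revenue by the probability the customer is still around, but the discounting logic stays the same.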
                  Thanks for the article, Yanir! I am a huge proponent of using formulas for CLV only as a starting point, when similar historical models aren’t available. When a good historical financial model is available, it becomes much more useful than the generic formula. I was just speaking with a service vendor who was trying to convince us to let his company perform exhaustive FMEAs on all of our equipment, when we had years of failure data on which to base a maintenance strategy. Only rely on the theoretical when the empirical isn’t an option.

                  Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.


                  Very enlightening post! It was awesome to see that your Elasticsearch insights made it into a PR. I bet that was worth the whole thing!
                  That’s very exciting! I wanted to ask: are you a self-learner or do you have a degree? Can you please share your background? Thank you.
                  Thanks Mostafa. Yes, I have a BSc in computer science, and a PhD in what you would now call data science. See: https://www.linkedin.com/in/yanirseroussi/

                  This was an amazing post, Yanir! Loved the breakdown and the patience you had for the whole process, very well played and you really deserved it! :)

                  P.S: Really can connect as I’ve been working independently for a while now and would definitely be open to looking for long-term contracts or remote jobs like this.

                  Your post is really a therapy for most people who apply for jobs and lose hope while waiting. I believe patience is the key to everything. Thanks.

                  Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.

                  Cool… not sure why and when I subscribed to your mailing list, and now quite surprised to hear that Bandcamp Recommender was your project. I am a Bandcamp freak… Bandcamp has recently started showing recommendations at the bottom ;-) seems primitive though. Example: https://ogreyouasshole.bandcamp.com/album/crossword-lost-sigh-days-james-mcnew-remixes Would love to hear about the basic logic you used behind the “recommendations”. I have no technical knowledge at all, but a few years ago I thought of a basic recommendation model, though I couldn’t take it forward… I thought “contextualizing” artists would be a cool way to connect bands.

                  I want to become a data science freelancer. Can you provide some advice?

                  As with any freelancing job, expect to spend much of your time on sales and networking. I've only explored the freelancing path briefly, but Radim Řehůřek has published great slides on the topic. If you're thinking of freelancing as a way of gaining financial independence, also consider spending less, earning more, and investing wisely.

                  Can you recommend an academic data science degree?

                  Sorry, but I don't know much about those degrees. Boris Gorelik has some interesting thoughts on studying data science.

                  Will you be my mentor?

                  Probably not, unless you're hard-working, independent, and doing something I find interesting. Feel free to contact me if you believe we'd both find the relationship beneficial.

                  Can you help with my project?

                  Possibly. If you think I'd find your project exciting, please do contact me.


                  What about ethics?

                  What about them? There isn't a single definition of right and wrong, as morality is multi-dimensional. I believe it's important to question your own choices, and avoid applying data science blindly. For me, this means divesting from harmful industries like fossil fuels and striving to go beyond the creation of greedy robots (among other things).

                  I’m a manager. When should I hire a data scientist and start using machine learning?

                  There's a good chance you don't need a data scientist yet, but you should be aware of common pitfalls when trying to be data-driven. It's also worth reading Paras Chopra's post on what you need to know before you board the machine learning train.

                  Do you want to buy my products or services?

                  No. If I did, I'd contact you.

                  I have a question that isn’t answered here or anywhere on the internet, and I think you can help. Can I contact you?

                  Sure, use the form on this page.

                    Subscribe

                    Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.

                    Thanks so much for sharing this Yanir!

                    Indeed, such questions come up frequently. Thanks for providing answers to help guide folks. I might add a few things:

                    when ready for the job search… Advice to Data Scientists on Where to Work http://multithreaded.stitchfix.com/blog/2015/03/31/advice-for-data-scientists/

                    if you are going to get into data science, do it for the right reasons. Let your passion drive! https://www.quora.com/How-do-I-move-from-data-scientist-to-data-science-management

                    Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.

                    Great set of definitions and path of evolutions here!

                    There has to be chaos and confusion as the field evolves, surely, but the consensus, as you very well mentioned, is decisions. Anything done in the data world that doesn’t lead to decisions isn’t viable in the long term.

                    Thanks for sharing your thoughts, love reading your blog.

                    Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.

                    I have been working remotely for WRI for nearly 2 years, and I can resonate with almost everything you have said. Great blog!
                    Interested. Though not trained as a data scientist yet, I’ve been a BI consultant for over a decade. Let me know if you have any opportunities.
                    I am working for Accenture as an analyst. The article is very similar to my real life. I studied data science at a top university and worked on a few capstone projects.

                    Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.


                    “the basic bootstrap makes no assumption about the underlying distribution of the data”: I suppose bootstrapping per se doesn’t, but some things people like to use it for do. For example, suppose the original sample is from a Cauchy distribution, and bootstrapping is used to compute a confidence interval around the sample mean; no matter how many bootstrap replicates are used, the computed interval is worthless, because the original distribution doesn’t have a mean. Of course, that’s an extreme case unlikely to arise in practice, but it immediately raises doubt that guidelines like “n ≥ 101 for the bootstrap t” should be applied uncritically. As you obviously agree, it’s best to know and think about where the data came from when deciding which statistical methods to apply.
                    Learned a lot from the post. One question on the CIs for means and the difference between the means. If the two CIs for the means do not overlap, does it always imply that the difference is significant? Or can the error go in both ways, meaning that it is possible to have non-overlapping CIs and the CI of the difference includes 0?
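One way to explore this question is empirically: compute percentile-bootstrap CIs for each mean and for the difference, and see how they relate on simulated data. A pure-Python sketch, where the toy data and bootstrap settings are arbitrary:

```python
import random

def percentile_ci(values, n_boot=4000, level=0.95, seed=1):
    """Percentile bootstrap CI for the mean of a sample."""
    rng = random.Random(seed)
    means = sorted(
        sum(rng.choice(values) for _ in values) / len(values)
        for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * n_boot)
    return means[lo_idx], means[n_boot - 1 - lo_idx]

def diff_ci(xs, ys, n_boot=4000, level=0.95, seed=2):
    """Percentile bootstrap CI for mean(ys) - mean(xs), resampling
    the two groups independently."""
    rng = random.Random(seed)
    diffs = sorted(
        sum(rng.choice(ys) for _ in ys) / len(ys)
        - sum(rng.choice(xs) for _ in xs) / len(xs)
        for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * n_boot)
    return diffs[lo_idx], diffs[n_boot - 1 - lo_idx]

rng = random.Random(0)
a = [rng.gauss(0.0, 1.0) for _ in range(200)]
b = [rng.gauss(0.5, 1.0) for _ in range(200)]
print("mean(a) CI:", percentile_ci(a))
print("mean(b) CI:", percentile_ci(b))
print("difference CI:", diff_ci(a, b))
```

As a rule of thumb for means, comparing per-group CIs for overlap is the more conservative check: the standard error of the difference is smaller than the sum of the two standard errors, so non-overlapping CIs generally imply a difference CI that excludes zero, while overlapping CIs can still correspond to a significant difference.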

                    Reblogged this on Boris Gorelik and commented:

                    Anything is better when bootstrapped. Read my co-worker’s post on bootstrapping. Also make sure to follow the links Yanir gives to support his claims.

                    Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.

                    Reblogged this on Boris Gorelik and commented:

                    Many years ago, I terribly overfit a model, which caused losses of a lot of shekels (a LOT). It’s not that I wasn’t aware of the potential for overfitting. I was. Among other things, I used several bootstrapping simulations. It turns out that I applied the bootstrapping in the wrong way. My particular problem was that I “forgot” about confounding parameters and that I “forgot” that peeking into the future is a bad thing.

                    Anyhow, Yanir Seroussi, my data scientist coworker, gave a very good talk on bootstrapping.

                    Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.


                    This is the same conclusion I reached when deciding between deepening data science skills vs engineering; now I’m deeper into cloud services and off-the-shelf ML tools.
                    Good points, thanks Boris!

                    Hi Yanir!

                    The post really resonated with me. I find more and more that I do more engineering than science during my day.

                    I believe the data engineering part, including cloud and full stack development skills, will prove to be the skills that keep you relevant in industry. If you combine these with knowledge on which techniques to use regarding data science and machine learning, then you can be unstoppable.

                    Otherwise, as you said, it’s better to stay in academia.

                    Best, Antonios

                    Hi Yanir,

                    I am glad I found your post. I am switching careers and want to work with data for social good. I learned data analysis, and am thinking of exploring machine learning to see if it’s something for me. Being an engineer (not tech related), I would definitely be content with your option 1, where I understand what’s going on behind the scenes but don’t get into research and maths.

                    Interesting to know what the trend is in the field.

                    Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.

                    Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.


                    Loved this short blog. Planning to transition to climate tech as a DS guy, and slowly cultivating a pent-up passion for conserving marine life, so there are too many things I can relate to here haha
                    Thank you! Good luck with the transition. 🙂

                    Public comments are closed, but I love hearing from readers. Feel free to contact me with your thoughts.

                    This is a very well written post and it’s great to hear your reasoning on this. Congrats on your new position!

                    I’m amused that the first sample on the home page features the user asking, “this code is not working like i expect - how do i fix it?” In my mind, I hear the voice of HAL (Douglas Rain) answer, “This sort of thing has cropped up before, and it has always been due to human error.”

                    I’ve been assuming ChatGPT is just the latest specimen of what typically passes for AI these days, a system with an elaborate model of utterances disconnected from any deeper and richer model of the world to which utterances refer, hence brittle and shallow. (Such systems more or less realize John Searle’s “Chinese room” scenario, although unlike Searle, I don’t think they represent any fundamental limit on AI, merely the current, crude state of the art.) However, you’ve convinced me to try it out.

                    Thanks Ralph! It’s definitely still early days, but it feels like a step change in chatbot tech, much like the significant improvements in image recognition from a decade ago. I can see it only getting more capable with all the interaction data that they’re collecting.

                    Reading of the despair of the many silent/invisible contributors of intellectual property copied, without financial recompense, which went into these LLMs. It’s a training data gold rush/free for all right now, just as unedifying and thoughtless as actual gold rushes in history were. I fear this will also quickly lead to Apple-style closed gardens of proprietary creative talent (think of top artists, writers and thinkers signed up to train AI rather than creating content for direct human consumption). Counter-ML tech like Glaze will only delay the inevitable. https://glaze.cs.uchicago.edu/.

                    I can think of some ways where governments might respond, e.g. special taxes and incentives on AI businesses to fund creative academies and collectives, much as public universities are today.

                    Thanks John! Yeah, I doubt that tech like Glaze can be made future proof, as they admit on that page. Besides, I lean more towards the view that all creative work is derivative and copying isn’t theft. Copyright mostly protects platforms and businesses rather than individuals. While I empathise with individuals who feel like their work is being exploited without their permission, I don’t see the training of machine learning models as being that different from artists learning from other artists.

                    Thoughtful government intervention would be great, but it’s unlikely to be applied in a timely manner or evenly across jurisdictions.

                    2024

                    June

                    Five team-building mistakes, according to Patty McCord

                    Takeaways from an interview with Patty McCord on The Startup Podcast.

                    June 26, 2024

                    Dealing with endless data changes

                    Quotes from Demetrios Brinkmann on the relationship between MLOps and DevOps, with MLOps allowing for managing changes that come from data.

                    June 22, 2024

                    The rules of the passion economy

                    Summary of the main messages from the book The Passion Economy by Adam Davidson.

                    June 12, 2024

                    May

                    Adapting to the economy of algorithms

                    Overview of the book The Economy of Algorithms by Marek Kowalkiewicz.

                    May 25, 2024

                    April

                    LinkedIn is a teachable skill

A high-level overview of things I learned from Justin Welsh’s LinkedIn Operating System course.

                    April 11, 2024

                    The data engineering lifecycle is not going anywhere

                    My key takeaways from reading Fundamentals of Data Engineering by Joe Reis and Matt Housley.

                    April 5, 2024

                    March

                    Atomic Habits is full of actionable advice

                    I put the book to use after the first listen, and will definitely revisit it in the future to form better habits.

                    March 12, 2024

                    February

                    The three Cs of indie consulting: Confidence, Cash, and Connections

                    Jonathan Stark makes a compelling argument why you should have the three Cs before quitting your job to go solo consulting.

                    February 17, 2024

                    Future software development may require fewer humans

                    Reflecting on an interview with Jason Warner, CEO of poolside.

                    February 6, 2024

                    January

                    Psychographic specialisations may work for discipline generalists

                    When focusing on a market segment defined by personal beliefs, it’s often fine to position yourself as a generalist in your craft.

                    January 9, 2024

                    The power of parasocial relationships

                    Repeated exposure to media personas creates relationships that help justify premium fees.

                    January 8, 2024

                    2023

                    December

                    Positioning is a common problem for data scientists

                    With the commodification of data scientists, the problem of positioning has become more common: My takeaways from Genevieve Hayes interviewing Jonathan Stark.

                    December 18, 2023

                    Transfer learning applies to energy market bidding

                    An interesting approach to bidding of energy storage assets, showing that training on New York data is transferable to Queensland.

                    December 14, 2023

                    November

                    Our Blue Machine is changing, but we are not helpless

                    One of my many highlights from Helen Czerski’s Blue Machine.

                    November 28, 2023

                    You don’t need a proprietary API for static maps

                    For many use cases, libraries like cartopy are better than the likes of Mapbox and Google Maps.

                    November 21, 2023

                    October

                    Artificial intelligence was a marketing term all along – just call it automation

                    Replacing ‘artificial intelligence’ with ‘automation’ is a useful trick for cutting through the hype.

                    October 6, 2023

                    September

                    The lines between solo consulting and product building are blurry

                    It turns out that problems like finding a niche and defining the ideal clients are key to any solo business.

                    September 25, 2023

                    Google’s Rules of Machine Learning still apply in the age of large language models

                    Despite the excitement around large language models, building with machine learning remains an engineering problem with established best practices.

                    September 21, 2023

                    August

                    The Minimalist Entrepreneur is too prescriptive for me

                    While I found the story of Gumroad interesting, The Minimalist Entrepreneur seems to over-generalise from the founder’s experience.

                    August 21, 2023

                    Revisiting Start Small, Stay Small in 2023 (Chapter 2)

                    A summary of the second chapter of Rob Walling’s Start Small, Stay Small, along with my thoughts & reflections.

                    August 17, 2023

                    Revisiting Start Small, Stay Small in 2023 (Chapter 1)

                    A summary of the first chapter of Rob Walling’s Start Small, Stay Small, along with my thoughts & reflections.

                    August 16, 2023

                    Email notifications on public GitHub commits

                    GitHub publishes an Atom feed, which means you can use any RSS reader to follow commits.

                    August 14, 2023

                    The rule of thirds can probably be ignored

                    Turns out that the rule of thirds for composing visuals may not be that important.

                    August 11, 2023

                    July

                    Using YubiKey for SSH access

                    Some pointers for setting up SSH access with YubiKey on Ubuntu 22.04.

                    July 23, 2023

                    Making a TIL section with Hugo and PaperMod

                    How I added a Today I Learned section to my Hugo site with the PaperMod theme.

                    July 17, 2023

                    You can’t save time

                    Time can be spent doing different activities, but it can’t be stored and saved for later.

                    July 11, 2023