M&Ms: The Bot Wars
How recent advancements in technology may soon play out on the 85th edition.
Members of the operations team came running across the 8th floor, shouting, "turn off all the third-party products now!"
The software engineers responsible for what products appeared on our eCommerce site sat across from me. They were as shocked as I was to hear the screams of usually friendly professional business people.
As an eCommerce marketplace, third-party products represented the bulk of our catalog. What would happen to our sales after we turned all that off?
But after a few seconds, the shock wore off, and an engineer on the Nova team asked for verbal approval: "We will Doom all third-party SKUs off the site; this will take an hour or two to propagate. Is this what you want?"
One of the people from operations shouted, "Yes, it's all f*&%$d; take it all down."
Our teams and systems were named after superheroes, a typical software engineering thing to do because naming things is hard and superheroes are fun. Nova was in charge of the product catalog, and to "Doom" products (as in Dr. Doom) meant to nuke them: to take them off the site.
The reason the catalog team had to Doom things this particular time had to do with our competitive intelligence system. A team had built a system in-house to keep track of our competitors' prices in real time. It's a little-known secret in eCommerce, but every major shop does it. It's how Amazon can undercut competitors in near real-time on products and categories it deems very important. Sometimes they will even lose money to undercut them; the margins will presumably be made up on AWS or their ads platform. And all of this is legal. The court rulings on this more or less say that it is like standing in front of a shop on the street and taking a picture of their prices, especially if none of that info is behind password-protected pages.
But it was pretty much unheard of for a small startup to be able to scrape the big guys.
Big eCommerce players employ sophisticated bot protection. Some of that bot protection the end user sees, e.g., in the form of captchas, but there are other techniques. One technique is to send you to special pages with bad information. Another is the exponential slowdown: Amazon, for example, patented a way to slow its responses exponentially the harder you hit them. This makes it look like something is wrong with your internet connection, making it hard to detect that they've detected you. Are pages loading slower because you have a connection issue or because they think you are a bot? Hard to tell.
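To make the slowdown idea concrete, here is a minimal sketch in Python. It is not Amazon's patented method, which I have not seen; it is just one plausible way to do it. The class name, the parameters, and the "double the delay for every request over the limit" rule are all my own assumptions for illustration.

```python
import time
from collections import defaultdict, deque


class ExponentialThrottle:
    """Hypothetical per-client throttle (illustration only, not any real shop's code).

    Clients under the request limit see no delay. Every request over the
    limit inside the sliding window doubles the artificial delay, up to a cap.
    """

    def __init__(self, limit=5, window_s=10.0, base_delay_s=0.1, max_delay_s=30.0):
        self.limit = limit                # free requests allowed per window
        self.window_s = window_s          # sliding window length in seconds
        self.base_delay_s = base_delay_s  # delay for the first request over the limit
        self.max_delay_s = max_delay_s    # cap so the delay never grows without bound
        self._hits = defaultdict(deque)   # client id -> timestamps of recent requests

    def delay_for(self, client_id, now=None):
        """Record one request and return how long to stall the response, in seconds."""
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Forget requests that have fallen out of the sliding window.
        while hits and now - hits[0] > self.window_s:
            hits.popleft()
        hits.append(now)
        excess = len(hits) - self.limit
        if excess <= 0:
            return 0.0
        # 1st request over the limit -> base delay, 2nd -> 2x, 3rd -> 4x, ...
        return min(self.base_delay_s * 2 ** (excess - 1), self.max_delay_s)


if __name__ == "__main__":
    throttle = ExponentialThrottle(limit=3, base_delay_s=0.1, max_delay_s=1.0)
    for i in range(8):
        delay = throttle.delay_for("suspected-bot", now=i * 0.1)
        print(f"request {i + 1}: stall {delay:.1f}s")
```

A server would sleep for the returned delay before answering. From the scraper's side, nothing gets blocked outright; pages just load slower and slower, which is exactly why it is so hard to tell detection apart from a bad connection.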
But at the same time, Amazon makes damn sure their own bots can scrape their competitors and go undetected. This is evident from their ability to undercut competitors. This is the game of bots vs. bots, playing out every millisecond, that few people are privy to. Bots impact more than just social media replies; they impact the prices you pay for things, the availability of products, and so much more. And without the capability to deploy them, new eCommerce entrants have no chance of competing with Amazon. And it is hard to do, which is why so few try.
But back at our offices in early 2016, with our Dooming issue, a competitor had gone nuclear and placed captchas on everything. This meant none of the IPs we used for scraping could get past them. When an IP is listed as a bot, it's very hard to get it off that list. You can't just change your IP; it's hard to get access to real new IPs that look like customers, which is what you need to see real product information and up-to-date prices.
Our competitive intelligence system that did the scraping was called Spider-man internally, unironically. And that day the operations business folks explained to the Nova engineers that "Spider-man was down, and we are running blind and bleeding money."
Listening to their conversation quietly, I knew this was a huge deal and likely to hurt business. But I had my own bots to worry about; my "bots" were sending real-time price information and availability information to advertisers like Google and FB from systems I had built. And as those operations people spoke, my bots were running ads on products that would soon no longer be available for purchase on our site. But it's worse than that, because it also meant we paid Google a dollar or two for each ad that sold nothing. A double whammy. And bleeding money is how you go out of business and definitely miss out on any hope of being promoted. So I frantically checked what was happening with my "bots."
The root cause analysis of what happened to Spider-man, the scraping bots, showed that a new type of captcha was running in the wild. Captchas are a double-edged sword: yes, they try to stop the bots, but they also annoy the hell out of customers. No human likes solving captchas, so eCommerce platforms use them sparingly. But something was up. Our pricing and competitive intelligence teams had caught the eye of Sauron. There was a lot of media attention on us and how fancy our tech was. And that eye was hell-bent on hurting us.
But it is at this point that I want you to know that building bots is an arms race.
When you throw up a fancy new technique, system, or bot, you are daring smart people on the other side to come up with an even fancier way to beat you. And back then, I worked with some of the smartest people I have ever met in my life.
And sure enough, the Spider-man team had figured it out. A new system, code-named Cypher, was needed that used AI and ML to solve captchas. If the captcha is the A-bomb, pulling out Cypher is akin to pulling out the H-bomb. It used optical character recognition to detect even malformed characters and other fancy techniques to defeat captchas. Once Cypher beat a captcha, that IP was good for a long time. My brother worked on the Spider-man team at the time. His promotion to Staff Engineer was largely credited to helping build Cypher and putting us back in business.
I hope you've been paying attention; our scraping system, Spider-man, could now scrape competitors in real time even when they threw captchas up because Cypher would kick the crap out of them. Spider-man and Cypher did this so price info could be funneled to Superman, which would make decisions on what to price things on the site. And all so Nova could turn third-party products back on. And so Storm, my team, could turn the advertising feeds back on too, and could pump traffic to the site.
And all of this was happening at scale in late 2015; imagine what the bots are doing now in 2022.
But no matter what the bots are doing now, and no matter how we use new innovations, I want you to rest assured that Newton's third law stands strong: "For every action, there is an equal and opposite reaction." You can be sure that those impacted by that tech will find help or build their own bots and strike back.
And now I leave you with Yoda’s famous words, "Begun the [Bot] war has."
Before I share the tweets I usually share, I want to take a second and thank you. Thank you for being in this newsletter with me, for trusting me to be in your inbox, and for giving me a few minutes of your attention each week.
As an entrepreneur just starting out, it means a lot to me to have a direct relationship that can't be easily taken away. And especially these days with so many crazy things happening on Twitter.
Three Tweets: Biases, Work, and Selling Goods
What I love most about Paul Graham is that he is an independent thinker.
And regardless of whether I agree with his views, he tends to be more objective than most people.
And being born in communism, like I was, you realize independent thinking takes courage. Independent thinking is one of the scarcest resources we've got. Easy to see why most people want to fit in with a group and be liked.
Anyway, today, Paul tweeted that he was taking a break from Twitter due to a new policy on sharing links to other social networks. He was then banned for a few hours under that same policy, even though he did not share a link at all.
Elon and Twitter quickly rolled back that policy and unbanned Paul, but the damage is done.
A deeply troubling thread that exposes a trend among men in the workforce.
I’ve been following Moses (and Dave) for some time and this short thread is an excellent reminder.
One Slide: The Disorder Brothers
Nassim Nicholas Taleb says how you react to The Disorder Brothers determines how fragile or antifragile you or anything is.
He also says if you like one thing on the list, you like them all, and he's spent a lot of time thinking through that.
That is, if you like uncertainty, then you also like time because time brings with it uncertainty.
Two Memes: Which iPhone, FB reacts
Let’s just say the iPhones aren’t changing a whole lot lately.
Facebook’s reaction to Twitter’s rule banning links out to platforms like Facebook was pretty good today.
As always, thank you for reading.
-Louie
P.S. you can reply to this email; it will get to me, and I will read it.