Introducing Leadbay's First Model and Early Results

Jan 1, 2024

January 2024, Milan Stankovic, PhD, Co-founder at Leadbay


This article introduces Leadbay's first domain-specific model for B2B sales prospecting. Its first capabilities cover lead generation, lead scoring, and ideal customer profile refinement. The Leadbay research team consists of Milan Stankovic, PhD, who specializes in leveraging information-poor data for AI (60+ publications), a small tech team, and a small group of B2B sales leaders from Google, Microsoft, Contentsquare, and Scaleway. Read our manifesto here.


1. Methods

a. Technical Challenge

Recent advances in AI have not yet been meaningfully leveraged in B2B sales. Sales teams rarely have adequate data of their own for AI predictions.

To succeed in B2B sales, the system must be very good at:

  • Leveraging external (public) data,

to know who sells {what, how, where, to whom} and who needs {what, when, why} to {produce, distribute, sell} what.

  • Leveraging small and messy data about past wins/losses.

This means going beyond the “💩 Garbage in = 💩 Garbage out” limitation.

  • Leveraging user data to imitate the reps' instinct.

b. Leadbay models

Leadbay models and algorithms are the fruit of:

  1. Fine-tuning of existing open-source models

  2. Companies' internal knowledge

  3. Crawling public data

c. Introducing Leadbay's first model

Leadbay develops a pre-trained model to generate an abstract (vector) representation of leads. This abstract representation is then used to perform a variety of prediction tasks (estimating the likelihood of a won / lost status, finding similar leads, spotting changes in lead closing tendencies that are not observable to the naked eye).


The intuition behind this approach is that a large model, pre-trained on a global corpus of data, is representative of the “collective subconscious”, i.e., the patterns of expression of people in general. Our goal is to make sense of sales history data that is often limited to only a company name, and to understand why particular leads buy from our client while others don’t. In this information-poor form, the historical data on its own has little predictive power.


We augment the sales history data with textual information found on the web and other public data sources. That way, Leadbay is able to generate rich representations of leads that have the capacity, given a previous sales history, to predict the affinity of future potential leads to the products and services offered by our clients.


Leveraging the 'collective subconscious' to perform this task is similar to how a seasoned salesperson can intuitively guess the likelihood of closing a client. Leadbay COMPASS now helps them do it faster, saving time on tedious information-gathering tasks and offering a unique way to find the most effective direction for sales efforts, hand in hand with AI.

2. Performance metrics

a. Lead Status Prediction: Predictive power of the lead vector

Here we are interested in how our vectors perform when a very basic classification approach is applied to them. We first calculate the centroid of the lead vectors associated with status won, and then the same for status lost. For each lead, we calculate its cosine similarity with each of the two centroids. The predicted status is the one whose centroid is most similar to the lead (the argmax of the two similarities).
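
The nearest-centroid procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not Leadbay's actual implementation: the function names, the three-dimensional toy vectors, and the choice to resolve ties in favour of "won" are all assumptions made for the example.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def predict_status(lead_vector, won_centroid, lost_centroid):
    """Return the status whose centroid is most similar to the lead."""
    similarities = {
        "won": cosine_similarity(lead_vector, won_centroid),
        "lost": cosine_similarity(lead_vector, lost_centroid),
    }
    # argmax over the two similarities; a tie goes to "won" (arbitrary choice)
    return max(similarities, key=similarities.get)

# Hypothetical toy lead vectors; real lead vectors would come from the model.
won_vectors = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
lost_vectors = [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]]

won_centroid = centroid(won_vectors)
lost_centroid = centroid(lost_vectors)

prediction = predict_status([0.7, 0.3, 0.0], won_centroid, lost_centroid)
```

In this toy setup, the new lead points in nearly the same direction as the won centroid, so it is predicted as won.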


Comparing the true statuses and the predicted statuses, we can calculate several metrics:

  • General Accuracy: the number of leads with correctly predicted statuses divided by the total number of predictions.

  • Precision: the number of correctly predicted closed-won statuses divided by the total number of leads predicted as closed-won.

  • Recall: the number of correctly predicted closed-won statuses divided by the total number of leads associated with closed-won in the real data.
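
As a sketch of how these three metrics follow from the definitions above, the function below computes them from two lists of statuses. The function name, the "won"/"lost" labels, and the example data are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen for this example.

```python
def classification_metrics(true_statuses, predicted_statuses, positive="won"):
    """Accuracy, precision and recall, with `positive` as the closed-won label."""
    if len(true_statuses) != len(predicted_statuses):
        raise ValueError("both lists must have the same length")

    pairs = list(zip(true_statuses, predicted_statuses))
    correct = sum(1 for t, p in pairs if t == p)
    true_positive = sum(1 for t, p in pairs if t == positive and p == positive)
    predicted_positive = sum(1 for _, p in pairs if p == positive)
    actual_positive = sum(1 for t, _ in pairs if t == positive)

    return {
        # correctly predicted statuses / total number of predictions
        "accuracy": correct / len(pairs) if pairs else 0.0,
        # correctly predicted won / all leads predicted as won
        "precision": true_positive / predicted_positive if predicted_positive else 0.0,
        # correctly predicted won / all leads actually won
        "recall": true_positive / actual_positive if actual_positive else 0.0,
    }

# Hypothetical example: five leads with known and predicted statuses.
true_statuses = ["won", "won", "lost", "lost", "won"]
predicted_statuses = ["won", "lost", "lost", "won", "won"]
metrics = classification_metrics(true_statuses, predicted_statuses)
```

Here three of the five predictions are correct (accuracy 3/5), and two of the three leads predicted as won are actually won, out of three real wins (precision and recall both 2/3).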


b. Anonymous customers - results

Pipeline data - prior to classification by Leadbay

Pipeline - classified and augmented by Leadbay

Leadbay separates leads that will close (magenta) from those that will be lost (yellow). Its accuracy is 96%, measured on CRM history data.

c. Anonymous customer - early analysis


From our CRM history dataset sample, we understand that:

  • B2B sales reps waste roughly 70% of their time on leads that will never close*

  • The Leadbay model correctly predicts the win/loss outcome of a lead in 96% of cases


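Let’s do some (rough) maths

  • Take a team of 20 sales reps, each with an annual cost of €80,000 to the company: 20 x €80,000 x 70% = €1.12M (about €1.1M) per year is spent on leads that will never close.

  • If that time is redirected to leads that can close, the share of time spent productively goes from about 30% to, in the best case, 100%: you can now roughly x3 your team's efficiency.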
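
The back-of-the-envelope figures above can be checked with a few lines of Python. The variable names are ours, and the efficiency multiplier rests on an assumption the article does not spell out: that in the best case all time currently spent on leads that will never close is redirected to leads that can close.

```python
# Inputs taken from the example in the text
team_size = 20                      # sales reps
annual_cost_per_rep_eur = 80_000    # cost to the company per rep, per year
wasted_percent = 70                 # share of time spent on leads that never close

total_cost_eur = team_size * annual_cost_per_rep_eur

# Integer arithmetic keeps the result exact: 1,120,000, i.e. about EUR 1.1M
wasted_cost_eur = total_cost_eur * wasted_percent // 100

# Assumption: best case, all wasted time becomes productive time, so
# productive time grows from 30% to 100% of the total.
productive_percent = 100 - wasted_percent
max_efficiency_multiplier = 100 / productive_percent

print(f"Spent on leads that never close: EUR {wasted_cost_eur:,}")
print(f"Best-case efficiency multiplier: x{max_efficiency_multiplier:.1f}")
```

The multiplier comes out slightly above 3, which matches the "x3" in the text as an upper bound rather than a guaranteed gain.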

d. Limitations

  • Limited to French and US companies. To validate the value proposition, we focused our R&D on French companies, but the whole product is built to scale globally. Once we have validated our work on the French market, it will be quick to add UK and German companies as inputs to our models.


  • For us, a lead is a company, not a person. During our market research, we validated with our target market (B2B sales teams selling to companies with 50-500 employees) that what matters most is targeting the right company, at the right time, with the right context. Moreover, most of the information needed to do the job efficiently is about the company rather than the people in it. This is especially true for our target market, which sells to mid-market companies, where the company matters more than the individuals (which may not hold for Enterprise sales). In the future, it will not be difficult to add person-level granularity.


*The figure that B2B sales reps waste 67% of their time on leads that will never close was calculated over several historical CRM datasets. It has been confirmed by several studies, including those by Steven Tulman, LeadMonk, and Zippia.