Detecting Rare Objects with Fewer Training Instances

Vu Tran

Principal Data Scientist, Property Analytics

AI enables doing ‘more’ with ‘less’

AI continues to dominate industry headlines, in part because of its reputation for speed and ease. Today, consumers are leveraging it to solve a wide array of problems, and they’re becoming more creative in its application to everyday life.

But for all the conveniences AI offers, training AI models is still strenuous. It requires a lot of data, calls for hours of human intervention, and can be costly. This all happens behind the scenes, and most people don’t see the tremendous amount of resources required to label data and train a model on it so that it’s ‘workable.’

What’s more, investment and resources aren’t as accessible for smaller companies. Larger companies may have the funding necessary to invest, but smaller ones often don’t. For all the advantages AI offers to businesses, there remain some glaring limitations.

We’re excited to share that our most recent paper, “Simple Supervised and Semi-Supervised Long-Tailed Object Detection,” was accepted into the Computer Vision and Pattern Recognition (CVPR) 2025 conference and highlights a solution to these challenges. In the paper, we describe how we’ve cracked the code on detecting rare objects with fewer training instances, which means training a model with less labeled data. This can help level the playing field between small and large companies and get innovative ideas to market more quickly, potentially sidestepping the need for additional funding.

Here’s how it works today. Imagine we want to train a model to recognize a cat. To train a capable AI system, a person would need to annotate millions of instances of cats, drawing a box around each one and labeling it until the machine “learns” the object. In effect, we’d show the model millions of boxed cats and teach it that “this is a cat.” Then we’d re-teach it over and over again. It’s far from a ‘one and done’ exercise.

Now, instead of someone labeling millions of images of cats to train a model, we can use fewer labeled images. The system can look at an unlabeled image, one with no human annotations attached, and say, “I believe there is a cat in this corner,” and it will label that cat by itself without human assistance. Human effort is freed up for more meaningful work, and the model is still trained. How? By using a technique called pseudo-labeling.

In this technique, after a few rounds of human intervention, the model labels the cat autonomously, and the pseudo-labels it creates are fed back in to retrain it in a self-training loop. This is called semi-supervised learning, and while the training happens many times, it’s automated.
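The self-training idea described above can be sketched in a few lines of Python. This is a minimal illustration of generic pseudo-labeling, not the method from the paper: the `Detection` type, the `select_pseudo_labels` and `self_training_round` helpers, the 0.9 confidence threshold, and the toy `toy_train` and `toy_predict` stand-ins are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Detection:
    """One predicted or annotated box (hypothetical type for this sketch)."""
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
    label: str
    score: float  # model confidence in [0, 1]; 1.0 for human annotations


def select_pseudo_labels(detections, threshold=0.9):
    """Keep only predictions confident enough to be treated as labels.

    The 0.9 threshold is an assumed value, not one taken from the paper.
    """
    return [d for d in detections if d.score >= threshold]


def self_training_round(labeled, unlabeled, train, predict, threshold=0.9):
    """Run one round of self-training.

    1. Train a teacher on the human-labeled examples.
    2. Let the teacher label the unlabeled images.
    3. Keep confident predictions as pseudo-labels.
    4. Retrain on human labels plus pseudo-labels.
    """
    teacher = train(labeled)
    pseudo = []
    for image in unlabeled:
        kept = select_pseudo_labels(predict(teacher, image), threshold)
        if kept:  # skip images with no confident prediction
            pseudo.append((image, kept))
    student = train(list(labeled) + pseudo)
    return student, pseudo


# Toy stand-ins so the sketch runs without a real detector.
def toy_train(examples):
    """Pretend to train; the 'model' just records how much data it saw."""
    return {"num_examples": len(examples)}


FAKE_PREDICTIONS = {
    "img_a": [Detection((0, 0, 10, 10), "cat", 0.97)],  # confident
    "img_b": [Detection((5, 5, 20, 20), "cat", 0.40)],  # not confident
}


def toy_predict(model, image):
    """Pretend to run inference by looking up canned predictions."""
    return FAKE_PREDICTIONS.get(image, [])


labeled = [("img_0", [Detection((1, 1, 8, 8), "cat", 1.0)])]
student, pseudo = self_training_round(
    labeled, ["img_a", "img_b"], toy_train, toy_predict
)
print(student["num_examples"], [image for image, _ in pseudo])
```

In this toy run, only the confident prediction on `img_a` becomes a pseudo-label, so the retrained model sees two examples instead of one. In a real system, `train` and `predict` would wrap an actual object detector, and the round would typically be repeated several times.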

“More mileage, less training data” is the idea, and it’s now a technology we’ve patented. The approach is groundbreaking, an entirely different way to tackle the problem. Its use can extend beyond insurance applications, such as our pioneering AI-powered property risk solutions, and it could make it cheaper, easier and faster to get accurate innovations out the door to benefit many markets.

When we were researching, we didn’t want to say, “We can train the AI model the same way it’s trained today, and we need the same resources to do it.” Our team wanted to arrive at a place where we achieved better results than others, as shown in our paper, but with less pre-labeled input data and a more versatile approach that a company of any size could emulate.

We’re excited by this breakthrough and acceptance into the CVPR 2025 conference. We look forward to bringing efficiencies like these to the insurance industry and to the rest of the market so we can work smarter, not harder, to achieve results.

Vu Tran joined LexisNexis® Risk Solutions in 2022 through the acquisition of LexisNexis® Flyreel®. Of the 13,008 papers submitted to CVPR 2025, his was one of 2,878 selected. This is the second CVPR publication from LexisNexis Risk Solutions. The first was submitted by Flyreel, which became a LexisNexis Risk Solutions company.
