Synthetic Users in Research: Clients, beware the s-AI-lesman

What we’re saying is synthetic user data is here to streamline your research, but only if used wisely.

 

Australia beware! There’s been a rise in opportunistic research companies looking to cash in on the AI gold rush, peddling synthetic user data. If it sounds too good to be true, it probably is.

Using synthetic users means leveraging AI to create models that simulate actual user behaviour, an approach gaining momentum in both academia and business.

Whether this data enhances or replaces real human insights remains a contentious issue in design and marketing strategies. However, its use is inevitable and should be embraced as supplementary to primary research.

 
 

In the elevator to your office (I have 30 seconds)

Before you start your work day, understand the potential benefits and concerns of using synthetic ‘customer’ data.

The good:

  • Synthetic users are a more cost-effective alternative to engaging with real humans.

  • They provide a scalable, faster way to test solutions, particularly those nearing the end of the design process.

  • They’ve already been applied successfully in medicine, banking, and government services.

The concerning: 

  • Critics warn of the risk of losing human authenticity and overlooking the complex, emotional nuances that only real users can provide. While synthetic users come close, they cannot yet fully replicate the depth of empathy and insight gained from interacting with real people.

  • Whilst insights are low-cost and scalable in nature, many research companies are taking the opportunity to set premium prices for AI data. You should be cautious of partners offering quick, too-good-to-be-true solutions that may prioritise profit over quality research.

  • Synthetic customer data is generated from LLMs, such as those built by OpenAI, and can be replicated with a bit of prompt engineering via an Enterprise ChatGPT licence. This can save your organisation the investment in a third party.

How are organisations currently using synthetic data?

Read on to find out.

 

Between workouts (I have 2 minutes)

Before moving from the exercise ball to the dumbbells, understand how synthetic users are currently being applied.

Advocates emphasise their ability to quickly generate diverse data sets, helping businesses test broader scenarios when data is scarce. For instance, a Google-sponsored initiative addressed data scarcity in certain skin conditions by developing a model that generates synthetic medical images exhibiting the desired skin characteristics.

Many see synthetic users as an opportunity to scale research processes quickly and affordably, as mentioned in EPAM’s exploration of synthetic data’s role in User Experience (UX) research. These AI-driven personas can test multiple scenarios across demographics, time zones, and unique user journeys without the logistical headaches of organising large focus groups. For businesses facing time and budget constraints, synthetic users can be a game-changer.

Synthetic users have been used to provide a controlled setting for testing before real-world application, as seen in the use of the CARLA simulator in autonomous vehicle testing. Such versatility helps businesses stress-test their products without the cost of real-world testing.

From a marketing perspective, some leaders have suggested that synthetic users are just the start, and that synthetic data strategies will be the future because they offer more flexibility at lower cost.

However, some critics say traditional research captures the emotional and contextual factors of real users, which AI-generated personas might overlook. Nielsen Norman Group highlights that synthetic users often lack the unpredictability and depth of real human interaction, which are critical in user experience (UX) testing. While these AI-driven simulations can be helpful in stress-testing systems and finding technical bugs, some fear they may fall short when it comes to understanding genuine human reactions.

The use of synthetic users has also been found to lead to biased decision-making. For example, some banks using synthetic data for credit scoring and mortgage analytics discovered that the generated data did not fully capture the complexities of real-world customer behaviour, particularly for underserved groups such as individuals with non-traditional credit histories. This not only affected the models' accuracy but also excluded certain customers.

Keep reading for our recommendations on how your organisation could use synthetic users.

 

On the walk to your car (I have 5 minutes)

Before embracing synthetic data, go through these steps to assess the viability of usage for your organisation.

1 — Recognise that real user research is still essential for your organisation. 

As mentioned, synthetic users cannot replace the depth and empathy gained from studying and speaking with real people. They often provide shallow or overly favourable feedback.

As pointed out by user experience experts, synthetic users often don’t account for the complex emotions, cultural context, and subtleties that real users bring to the table. Human emotions like frustration, joy, or uncertainty can dramatically impact how a product is perceived. An AI-generated user may not fully capture these nuances, making it difficult to design products that resonate emotionally with your customers.

In our experience, some of the most valuable insights in primary research come from what isn't explicitly said.

Humans are uniquely skilled at extrapolating beyond data points, drawing on creativity, intuition, empathy, and social cues. These abilities help us uncover the subtle nuances of a user’s context, shaping how they perceive and interact with a product or service. Often, these nuances — gathered through one-on-one conversations — will spark fresh, strategic insights for your organisation.

Additionally, there are concerns about ethical transparency.

If your organisation starts making decisions based on synthetic data, you should be transparent with stakeholders and customers about your research methods. This concern grows even more pronounced when discussing sensitive industries like healthcare, where the human aspect is essential. 

We strongly recommend taking into account the context of your organisation or industry when it comes to the ethical use of synthetic users in research. A small business selling predominantly low-cost FMCG products will have vastly different use cases from a global health insurance provider.

2 — Use synthetic research for specific purposes.

These AI-generated personas can provide consistent, scalable data for research. 

For example, they can simulate edge cases — uncommon but critical use scenarios — making them invaluable for testing digital interfaces under extreme conditions.

This makes them useful for desk research and generating hypotheses but not for your organisation’s final decision-making.

3 — Use a hybrid approach, which uses synthetic research as a supplement to real research, not a substitute.

We recommend using synthetic users to augment real-world testing rather than to replace it. This method will provide your organisation with the speed and scale of AI while retaining the authenticity of human feedback. In this way, synthetic users may become a valuable tool — but only when used in tandem with human insights.

While we're always open to adopting more efficient solutions for our clients, the future of research likely rests in this blend of synthetic and real user data. Together, they can offer the best of both worlds for marketing, digital, and experience design strategies.

 