
How to deal with unstructured data

The “briefing questions for unstructured data” were published by ESOMAR recently, and not a moment too soon. This comprehensive work includes 26 questions and has taken a year to complete, from inception to publication.

In general, there seems to be a lot of perceived complexity in using artificial intelligence or other engineered approaches to analyse text and images in an automated way. The ESOMAR briefing aims both to simplify matters and to offer the knowledge needed to understand what may seem complex and difficult at first glance.

The document has five sections meant to guide buyers of related tools and services in asking vendors the right questions, so that they can make an informed purchase decision. Although ESOMAR mainly caters to market researchers in organisations globally, many more users of text and image analytics solutions, sitting in different departments (see the list further down), can benefit from this briefing.

Section 1 – Company Profile and Capabilities

First of all it is important to know who we are dealing with. Is this a pure technology company with a tool or do they have any subject matter expertise? For example, market research and insights expertise would be nice if the buyer is an insights professional.

Section 2 – Data sources and types

Does the solution make use of specific data sources that it provides as part of the service, or is it purely an analytics solution, meaning the client must provide the data for analysis? Even if the company provides data, is the technology source-agnostic? In other words, can it process and accurately annotate data from social media, other public websites, answers to open-ended questions, transcripts of in-depth interviews and focus group discussions, call centre conversations, and organic conversations on private online communities?

Section 3 – Software design and capabilities

This section is one of the two most important ones. It helps the buyer understand how the data processing, annotation and analysis are done, in which languages, and on what types of data.

Section 4 – Data quality and validation

This is the other one of the two most important sections in the briefing. We all know the saying: garbage in – garbage out. This is about cleaning the data before processing and annotating them.
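
As a rough illustration of what such cleaning can involve, here is a minimal Python sketch. It is not taken from the ESOMAR briefing: the function names (`clean_text`, `clean_corpus`) and the specific rules (decoding HTML entities, dropping links, collapsing whitespace, removing empty and duplicate comments) are assumptions chosen for the example, and a real vendor's pipeline will likely differ.

```python
import html
import re

# Assumed cleaning rules for this sketch; a real pipeline may use different ones.
URL_PATTERN = re.compile(r"https?://\S+")
WHITESPACE_PATTERN = re.compile(r"\s+")


def clean_text(raw: str) -> str:
    """Normalise a single comment before it is annotated."""
    text = html.unescape(raw)                  # turn "&amp;" back into "&"
    text = URL_PATTERN.sub(" ", text)          # drop links
    text = WHITESPACE_PATTERN.sub(" ", text)   # collapse runs of whitespace
    return text.strip()


def clean_corpus(records):
    """Clean every record, skipping empty ones and case-insensitive duplicates."""
    seen = set()
    cleaned = []
    for raw in records:
        text = clean_text(raw)
        key = text.casefold()
        if not text or key in seen:
            continue
        seen.add(key)
        cleaned.append(text)
    return cleaned


if __name__ == "__main__":
    comments = [
        "Love  this &amp; that! http://t.co/abc",
        "love this & that!",
        "   ",
        "Too pricey",
    ]
    print(clean_corpus(comments))
```

A buyer could ask a vendor which of these steps, if any, their solution performs, and what else it does (for example spam or bot removal) before annotation starts.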

Section 5 – Ethical and legal compliance

The ESOMAR code of conduct has always been stricter than the law and this briefing is no different. Not only should the vendor be GDPR compliant, but they should also ensure that no harm is done to subjects in the research no matter how insignificant it may seem.

For some questions there are no right or wrong answers; the vendor just needs to have a plausible answer. If they do not, that in itself is a red flag.

Here are a few examples of users and use cases that can benefit from asking the 26 questions:

  1. Market research – for insights from social and other unstructured data sources
  2. Public relations – to manage brand and corporate reputation
  3. Customer service – to respond to questions, complaints and requests
  4. Advertising – to leverage positive testimonials
  5. Marketing – to find and leverage influencers
  6. Product Development – to learn about missing product features or ones that are not appreciated by consumers
  7. Innovation (beyond new product development) – to learn about emerging trends and new product use cases
  8. Competitive Intelligence – to gauge how competitors are doing in an industry or product category
  9. Operations – to learn about issues that need fixing
  10. Finance (together with marketing) – to find out about sentiment towards pricing
  11. Board – to benchmark and track sentiment on governance
  12. Sales – to find sales leads who express purchase intent

Consumer research has typically been performed by asking questions in surveys or qualitative research. For many insights professionals, social media intelligence, or intelligence extracted from other unstructured data sources, is fairly new. If this guide is your first exposure to Natural Language Processing or image analytics, some of the questions or the explanations given for context may not be enough to give you a thorough understanding of the issue the guide is trying to address. In that case, feel free to contact ESOMAR or the project team co-chairs directly with your questions.

If it turns out that we need to create answers to frequently asked questions about the 26 questions and their possible answers, this may imply that we did not do such a good job of simplifying for our audience. The only consolation is that even if the briefing contains a lot of complexity, it is a step in the right direction. Thank you, ESOMAR, for being open, flexible and very supportive of this initiative.
