Using AI to Improve Categorisation and Reduce Noise in Social Data

As the data landscape continues to grow exponentially (up 23.13% in 2025 vs 2024), it becomes ever harder to surface meaningful insights in a sea of noise.

One contributor to this challenge is generative AI. AI has grown rapidly in popularity over the past eight years: according to McKinsey & Company, 78% of businesses were using AI for at least one business function by July 2024, up from just 20% in 2017. With tools such as ChatGPT openly available to all internet users, individuals and organisations can now create and publish content in minutes rather than hours, breaking down barriers to scaling content production and contributing to the growing information landscape.

Inaccuracy is another challenge posed by generative AI, with many users believing all AI-generated information to be true when this is not always the case. The output of generative AI is only as good as the sources it pulls from: if generated content cites an untrustworthy source and is not fact-checked before publishing, misinformation begins to spread. Despite this risk, a McKinsey & Company survey found that only 27% of employees review all content created by generative AI before it is used.

To meet this growing challenge, AI is being used to improve categorisation and reduce noise in social data, something we're working with our clients to implement at KINSHIP Digital.

Common use cases we've worked with clients on are: 

  • Spam Identification – Spam accounts often exhibit common characteristics, such as language patterns, account names, and profile pictures. A trained model can automatically detect spam messages and distinguish spam accounts from genuine interactions. It can categorise spam messages based on pre-defined criteria, such as:  

  • Random URLs – Messages containing suspicious or unsolicited links. 

  • Indeterminable Foreign Language – Content written in obscure or nonsensical text. 

  • External Advertising – Unauthorised promotions or marketing messages.  
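The criteria above can be sketched as a simple rule-based categoriser. This is an illustrative assumption, not KINSHIP Digital's actual model: the category names come from the list above, but the regular expressions, keyword lists, and the alphabetic-token heuristic are hypothetical stand-ins for what a trained model would learn.

```python
import re

# Hypothetical rules approximating the pre-defined spam criteria above.
# In practice a trained model would learn these patterns from labelled data.
SPAM_RULES = {
    "Random URLs": re.compile(r"https?://\S+|bit\.ly/\S+", re.IGNORECASE),
    "External Advertising": re.compile(
        r"\b(buy now|limited offer|promo code|discount)\b", re.IGNORECASE
    ),
}

def categorise_message(text: str) -> str:
    """Return the first matching spam category, or 'Genuine' if none match."""
    for category, pattern in SPAM_RULES.items():
        if pattern.search(text):
            return category
    # Crude proxy for indeterminable/nonsensical text: a very low share of
    # purely alphabetic tokens suggests garbled content.
    tokens = text.split()
    if tokens and sum(t.isalpha() for t in tokens) / len(tokens) < 0.3:
        return "Indeterminable Foreign Language"
    return "Genuine"
```

A real deployment would replace these hand-written rules with a classifier trained on labelled spam examples, but the interface — message in, category out — stays the same.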

  • Custom Sentiment Model – Training a sentiment analysis model improves accuracy by accounting for linguistic nuances, such as sarcasm, context, and industry-specific language. This approach achieves higher precision in identifying true sentiment. 

Additionally, the model can be trained on customised emotion categories that align with a client’s specific reporting requirements. This enables organisations to move beyond basic positive, negative, and neutral classifications, providing deeper insights into public sentiment, engagement, and emerging trends. 
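As a minimal sketch of training on customised emotion categories, the toy classifier below uses a bag-of-words Naive Bayes. The "excitement" and "frustration" labels and the training sentences are illustrative assumptions; a production model would be trained on far more data and a richer feature set.

```python
import math
from collections import Counter, defaultdict

class TinyEmotionClassifier:
    """Minimal multinomial Naive Bayes over bag-of-words features."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter(labels)      # label -> document count
        self.vocab = set()
        for text, label in zip(texts, labels):
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.vocab.update(words)
        return self

    def predict(self, text):
        words = text.lower().split()
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label in self.label_counts:
            # log prior + log likelihoods with add-one smoothing
            score = math.log(self.label_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in words:
                score += math.log((self.word_counts[label][w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Illustrative training data using custom categories beyond pos/neg/neutral.
clf = TinyEmotionClassifier().fit(
    ["so excited for the launch", "thrilled and delighted today",
     "this outage is so frustrating", "really annoyed by the delay"],
    ["excitement", "excitement", "frustration", "frustration"],
)
```

The point of the sketch is that the label set is entirely client-defined: swapping in categories that match a client's reporting requirements only changes the training labels, not the model.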

  • Influencer Discovery / Brand Advocacy – Support with categorising messages into various personalised classifications, such as: 

  • Promoter Messages – Messages that actively support, endorse, or positively engage with the client. 

  • Passive Messages – Neutral or informational messages that neither strongly promote nor detract from the client. 

  • Detractor Messages – Messages that criticise, challenge, or negatively impact the client’s reputation. 

This classification allows organisations to monitor brand perception, identify emerging trends, and develop targeted engagement strategies to enhance public sentiment and trust. 
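The promoter/passive/detractor split above can be sketched as a simple keyword-scoring function. The term lists here are hypothetical examples, not a client's actual classification criteria, which would typically be learned from labelled messages.

```python
# Hypothetical advocacy cues; a trained model would learn these from data.
PROMOTER_TERMS = {"love", "recommend", "fantastic", "thank", "best"}
DETRACTOR_TERMS = {"terrible", "avoid", "disappointed", "worst", "scam"}

def classify_advocacy(text: str) -> str:
    """Classify a message as Promoter, Passive, or Detractor by keyword score."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    score = len(words & PROMOTER_TERMS) - len(words & DETRACTOR_TERMS)
    if score > 0:
        return "Promoter"
    if score < 0:
        return "Detractor"
    return "Passive"
```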

  • Entity Disambiguation – Our clients have trained models to differentiate between entities and keywords with multiple meanings, ensuring accurate sentiment analysis and contextual understanding. For example: 

  • Example 1: “Tropical Cyclone Alfred destroyed our backyard, but emergency crews reached us in time and kept my family safe”.   

Out-of-the-box sentiment analysis may flag this post as negative due to keywords like “destroyed,” overlooking the praise for emergency response, which is a crucial data point for crisis response evaluations. AI Studio can be trained to distinguish between entities and context, enabling crisis intelligence teams to detect the positive public sentiment toward first responders, even within messages that carry an overall negative tone. 

  • Example 2: An AI model can distinguish between Apple the technology brand and apple the fruit, eliminating ambiguity for more precise insights. 
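The Apple example can be sketched as context-cue matching: score each candidate entity by how many of its context words appear near the mention. The entity labels and cue lists below are illustrative assumptions, not a trained disambiguation model.

```python
# Hypothetical context cues for each candidate sense of "apple".
ENTITY_CUES = {
    "Apple Inc.": {"iphone", "ipad", "mac", "ios", "stock", "launch"},
    "apple (fruit)": {"eat", "pie", "juice", "orchard", "tree", "fresh"},
}

def disambiguate(text: str) -> str:
    """Pick the entity whose context cues overlap the message the most."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    best_entity, best_overlap = "unknown", 0
    for entity, cues in ENTITY_CUES.items():
        overlap = len(words & cues)
        if overlap > best_overlap:
            best_entity, best_overlap = entity, overlap
    return best_entity
```

Production systems use learned contextual embeddings rather than fixed word lists, but the underlying idea — resolving a mention by the company it keeps — is the same.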

As the speed and volume of data continue to climb, the challenge isn't just gathering information; it's making sense of it. While generative AI has contributed to the growth of the information landscape, it also offers a way to cut through the noise. At KINSHIP Digital, we're helping our clients better filter spam, refine sentiment analysis, and disambiguate online conversations by implementing and training AI models for them. AI is a good tool, but with the right training and management it can be made great. After all, as the saying goes: garbage in, garbage out. 

If you're using Sprinklr and would like to explore how any of these capabilities can be integrated into your current setup, your KINSHIP Digital consultant can help assess the best fit and get you started. 
