The CPN project is finished!

We have submitted our final deliverable, and are winding down the project activities. Are you curious about the next steps with the CPN tools? Read on.

After two and a half years of researching, user testing and developing our content personalisation platform, the CPN project has now come to a close. Thank you to everyone who followed our journey!

Are you interested in exploring our toolkit for content personalisation further? Send an email to the consortium partner Engineering to get more information on how to start using the CPN platform with your own content.


CPN Factsheet


Webinar takeaways: Trust, transparency and personalisation


A panel of media professionals discussed the various methods for news content personalisation, what their media organisations have achieved with personalisation, and the common challenges related to algorithmic journalism.

To mark the end of the CPN project, we organised a special webinar on how content personalisation can help news publishers create stronger, more engaged relationships with their audience members.

The panellists discussing the theme were:

  • Swantje Fischenbeck, Innovation Manager at Der Spiegel in Germany

  • Jarno M. Koponen, Head of AI & Personalization at Yle News Lab in Finland

  • Gordon Edall, Head of Labs at the Globe and Mail in Canada

  • Ine van Zeeland, PhD Researcher on Privacy at imec-SMIT, Vrije Universiteit Brussel in Belgium

The webinar also included presentations from Al Ramich, Founder and CEO of Loomi.AI, and Mattia Fosci, CEO of ID-Ward. The two startups have worked with CPN to develop innovative technical features that support content personalisation – read more about them after the article.

A video recording of the webinar is available below.

Personalisation needs to be combined with testing

As a news publisher, Der Spiegel is at the early stages of its personalisation journey, Fischenbeck said, having started looking into the topic a few months ago. The magazine is investigating questions such as: 1) Can personalisation generate business value while respecting Der Spiegel’s mission as a news publisher? 2) What technical possibilities exist for personalisation, and how can they help in competing for users’ attention? 3) What type of personalisation should Der Spiegel adopt, and what variables should it be based on: content, time, mood, context, etc.?

Fischenbeck emphasised that when determining the right personalisation approach for your publication, it is important to know your content and audience: “We realised that personalisation doesn’t work if you don’t match your solution with your context, company and users.”

At the other end of the personalisation spectrum is the Globe and Mail, which started working on personalised content seven years ago. During this time, the publisher has built a rich recommendation architecture that uses a number of modules to produce recommendations under different conditions.

One particularly successful personalisation feature has been “In case you missed it”, which highlights articles that the user probably didn’t see but is likely to be interested in. But that was a fairly simple solution to construct, Edall said, adding that many questions related to personalisation are much harder to answer, such as how to make recommendations when you don’t yet know anything about a user – the so-called “cold start” problem.

“As you layer personalisation technologies in, you should also be layering in test-and-learn systems, and learn whether your recommendation engines are working,” Edall said. “It’s not enough to do personalisation, you also have to do testing. It is hard to make salient recommendations, and the only way to know if you’re making salient recommendations is to figure out how to test those recommendations.”

Testing also helps in gathering the kind of proof that may be useful in internal conversations. “As you go to higher degrees of personalisation, it becomes harder and harder for editors to trust it,” Edall said. “So learn to talk to the newsroom through data, through testing, through evidence.”

A change of mindset

As publishers acquire more and more data about their readers in order to offer personalised content, users are understandably asking how their data is being used. To counter any fears, van Zeeland encouraged publishers to be transparent about the collection and use of user data.

“It’s very important to check what your users want and expect from you. Any type of personalisation would require that,” she said. “It’s important to regularly verify with users whether what you are doing is what they want and need, and whether they understand what you are doing.”

Fischenbeck pointed out that audiences may have reservations when it comes to personalisation of news, but in other contexts they have embraced the concept: “People are totally into it when it comes to Netflix or Spotify.”

For a publisher, personalisation can also represent a chance to rethink its relationship with its readers. “Instead of just sending stuff out to the world, it’s an opportunity to become more interactive and really engage with our users,” Fischenbeck said. “That’s an important goal for us, and is closely related to the importance of subscriptions these days.”

Yle also puts the focus on the user when designing personalisation systems, Koponen said. “Our approach has been, we let the person choose. They can have the total vanilla experience, with no personalised content. Or they can have different levels of personalisation,” he said. “It’s very important that the user feels in control.”

Koponen highlighted Yle’s news personalisation feature Voitto, a news assistant that shows recommendations directly on the user’s lock screen. Built by a multidisciplinary team, Voitto learns from the user based on both direct feedback (input from the user: “more of this” or “less of this”) and indirect feedback (user behaviour). “We had very good results: of the people who opted in to receive personalised news notifications, 90% decided to keep them on.”
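
A feedback loop of this kind can be sketched roughly as follows; the topic scores, signal names and weights below are our own illustrative assumptions, not Yle's implementation:

```python
# Sketch: updating a user's topic preferences from direct feedback
# ("more of this" / "less of this") and indirect feedback (read depth).
# All weights and structures here are illustrative assumptions.

def update_preferences(prefs, topic, signal, read_ratio=0.0):
    """Adjust a user's score for a topic from one feedback event."""
    score = prefs.get(topic, 0.0)
    if signal == "more_of_this":      # direct, explicit feedback
        score += 1.0
    elif signal == "less_of_this":    # direct, explicit feedback
        score -= 1.0
    elif signal == "read":            # indirect: how much was actually read
        score += 0.5 * read_ratio
    prefs[topic] = score
    return prefs

prefs = {}
update_preferences(prefs, "sports", "more_of_this")
update_preferences(prefs, "sports", "read", read_ratio=0.8)
update_preferences(prefs, "politics", "less_of_this")
# prefs now ranks "sports" above "politics"
```

Direct signals are weighted more heavily than indirect ones here, reflecting the intuition that an explicit "less of this" is a stronger statement than simply skimming an article.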

Covid-19 and personalisation

At the Globe and Mail, a big driver behind personalisation efforts is the aim to make sure that the right content finds the right audience, Edall said. “It doesn’t help us to write or spend time on a story that's going to matter to someone if that someone can't find it. That is a real problem with the kind of degree and amount of information that anyone has to process in any given day.”

“How much personalisation is needed to augment the wider editorial mandate that we're trying to establish and drive, is one of those things we talk about all the time,” he said. “We usually try to bake personalisation into widgets that are pretty clearly personalised, and where the nature of that widget is clear.”

The ongoing Covid-19 crisis has provided publishers with an opportunity to learn about news personalisation in these exceptional circumstances, where the great majority of news coverage relates to a single issue. Edall said that the coverage of the pandemic, or health information generally, is not a case where the Globe and Mail uses personalisation. But what often happens is, news about Covid-19 is what draws people in, and then many look for other types of news – and that’s where personalisation can play a role.

“Some people really want to move from the serious Covid stuff to much lighter stuff, horoscopes, entertainment articles. Then there's another subset of users who really prefer to be directed to political, business, or economic coverage that's just tangentially related to the overriding issues that are being caused by COVID,” he said. “You can simply give people better options earlier through personalisation that branches out from that dominant focal area for coverage. For me, that's the most interesting way to use personalisation right now.”

You can engineer your way out of the filter bubble

Koponen underlined the need for editorial judgement to coexist with algorithms: “From the very beginning we thought that our journalistic values need to be encoded into the algorithms,” he said. “The journalists in the newsroom need to be able to override personalised recommendations, if there is something very important happening.”

Yle also excludes some content from personalised recommendations. “If something horrible has happened, like a family has died, we don’t want a person to encounter that in our personalised ‘For You’ experience.”

Finally, Edall discussed the much-feared filter bubbles, and whether personalisation can trap readers in echo chambers. “Is there a real danger that wide-scale deployment of personalisation by a general interest news publisher, say a generic national title, could lead into a filter bubble? My answer is unequivocally no – if the personalisation system is designed well.”

“Personalisation systems don’t always have to cater to things that are closest to the core of the things that you have read. We have a large number of different recommendation modules, and some of those recommendation modules are actually explorers that are designed to avoid filter bubbles,” he said.

In other words, algorithms can be engineered in a way that safeguards against the creation of filter bubbles. “I would say, the only way you have recommendation systems that lead to filter bubbles is if you have irresponsible engineers who design the systems that create them.”


Startups

Loomi

Loomi specialises in platforms that enable people to create personalised AI assistants. The London-based startup has developed an extensive knowledge database, which it uses to enrich content metadata and enable better personalisation.

“If you’re doing content personalisation for thousands of users and with thousands of pieces of content, you need to automate the process. And for that you ultimately need a reference database of knowledge,” Ramich said.

Ramich said that there are different levels of contextual personalisation, ranging from manual tagging of content to AI and NLP-powered automation that extracts and tags entities.

“Ultimately what we do is create a knowledge graph that corresponds with a piece of news. Then, we can do the same on the user side,” he explained. “Then we combine those knowledge graphs, which lets us extract insights to allow personalisation.”
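
The matching step can be pictured, in heavily simplified form, as comparing the entities the two graphs share; the entity-overlap (Jaccard) scoring below is our own toy assumption, not Loomi's actual system:

```python
# Sketch: matching an article's knowledge graph against a user's,
# reduced here to simple entity overlap (Jaccard similarity).

def graph_similarity(article_entities, user_entities):
    """Score how well an article's entities match a user's interests."""
    a, u = set(article_entities), set(user_entities)
    if not a or not u:
        return 0.0
    return len(a & u) / len(a | u)

article = {"European Union", "GDPR", "privacy"}
user = {"privacy", "GDPR", "advertising", "cookies"}
print(round(graph_similarity(article, user), 2))  # → 0.4
```

A real knowledge graph would of course carry typed relations between entities, not just a flat set, but the principle of scoring articles by overlap with a user-side graph is the same.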

Click here to access Loomi’s presentation.

ID-Ward

ID-Ward is a UK-based data and AI compliance company that offers an innovative solution for collecting data about users while protecting their privacy.

“Privacy is important not only because we're afraid of the GDPR stick, but also because 3rd party cookies are coming to an end as a technology used to identify and track users,” said Fosci.

The company breaks the issue into three components:

  1. Identifying visitors: The firm’s one-click login makes it easy to authenticate users without relying on 3rd party cookies.

  2. Tracking data across domains and devices: The company’s infrastructure pools data across the domains that use its login system, making it possible to track a single user on different sites.

  3. Privacy protection: In addition to anonymising user data, the company is working on a system that uses the “federated learning” method to personalise content on the device – meaning that the personal data is never shared with the cloud.

“What's happening is the data is shared with the user data account, so the data is owned and controlled by the user,” Fosci said. “What the publishers have access to is anonymised data about the segment the user belongs to.”
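
In broad strokes, federated learning trains a shared model while the raw data stays on each device: devices compute local updates, and only model parameters travel to the server. A minimal sketch of the federated-averaging idea follows; this is a toy model of the general technique, not ID-Ward's code:

```python
# Sketch of federated averaging: each device improves the model on its own
# data, and only model weights leave the device, never the raw events.

def local_update(weights, local_data, lr=0.5):
    """One learning step on the device's private data (toy model:
    nudge each weight towards the corresponding local observation)."""
    return [w + lr * (x - w) for w, x in zip(weights, local_data)]

def federated_average(updates):
    """Server averages the per-device weight vectors; raw data never arrives."""
    n = len(updates)
    return [sum(ws) / n for ws in zip(*updates)]

global_model = [0.0, 0.0]
device_data = [[1.0, 2.0], [3.0, 4.0]]        # stays on each device
updates = [local_update(global_model, d) for d in device_data]
global_model = federated_average(updates)
print(global_model)  # → [1.0, 1.5]
```

The key property is visible in the data flow: `device_data` is only ever read inside `local_update`, so the server sees weight vectors, never the underlying behaviour.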

Click here to access ID-Ward’s presentation.


Pilot 3: what we learned from the last round of user testing

As part of the third pilot, news consumers tested the CPN app for a period of four weeks, providing us with valuable feedback on the performance of our recommender.

During the third pilot, which took place in January and February this year, the news audiences of VRT, DIAS and Deutsche Welle, the three media partners in the project, were able to test the CPN recommender by accessing the publishers’ news content through the CPN software.

Through monitoring the usage of the app, we were able to measure user engagement, article diversity and the viewing of long-tail articles. Furthermore, test users provided us with feedback through surveys, which allowed us to measure feelings of missing out or being informed, and much more.

The third pilot concluded the user testing part of the CPN project. (You can read more about the previous rounds here: First Pilot, Second Pilot.)

The CPN recommender with Deutsche Welle content


Key findings

Analysing the data and feedback collected from the test users, we have identified four major learnings from the pilot.

  • Different recommendation engines have different effects. We noticed that content-based approaches provided more engagement. In other words, we saw longer read times and higher scroll depth when readers were presented with articles that were recommended to them based on content. Moreover, long-tail articles, which typically get fewer views, were read more often thanks to content-based recommendation.

  • Personalisation did not lead to a perceived filter bubble. News professionals and audiences sometimes fear that personalisation can put them in a filter bubble (something we discussed in depth here). However, in our evaluation, we saw no difference between the control group and the personalised group, as both reported similar experiences.

  • Different recommenders provide different results. The hybrid recommender provided very good results in terms of article diversity, whereas the content-based recommender decreased diversity. (The hybrid recommender uses a mix of different techniques in order to overcome the weaknesses of a single recommender system, whereas the content-based recommender provides recommendations based purely on the consumed content. Read more about the CPN recommender engine here.)

  • Feeling of informedness. With regard to feeling more informed, there was no difference between the two groups: both the personalised and the control group felt equally informed.

Interested in knowing more?

The full report about the pilot process and findings will be made available on the website here once ready. For any questions, you can contact us here.

Are filter bubbles really to blame for social and political polarisation?


Many media professionals and news consumers fear that algorithmic personalisation can end up trapping readers in so-called filter bubbles. However, our review of the related research shows that concerns over the issue may be exaggerated.

Since the internet activist and publisher Eli Pariser popularised the concept of the digital filter bubble in 2011, many people have become suspicious of the growing influence of algorithms.

Although algorithmic curation has largely contributed to the success of YouTube, Facebook, Amazon, Twitter and beyond, voices including the former US president Barack Obama and German President Steinmeier have expressed concern over the negative effects this technology can have on societies.

The common fear is that recommendation engines give users similar content in continuous feeds, thus creating ‘echo chambers’ that amplify the already powerful confirmation bias and block out other perspectives that could challenge the users’ opinions or at least put them into perspective.

The research conducted during the CPN project, in the form of user surveys and expert interviews, showed that many news professionals and readers are also concerned about the concept of personalisation, suspicious of its potential to create ideological filter bubbles.

The echo chamber thesis seems largely founded on the inherent tribalism of social media. Social networks not only let us stay connected with old classmates and far-away relatives, but also create communities of likeminded people – for better or worse. Moreover, social media companies are still struggling to respond to the spread of false information, extremist content and outright hate speech on their platforms.

Segregation vs. viewpoint diversity

Given the widespread concerns, there is a surprisingly small number of scientific studies available about filter bubbles, but the research that currently exists draws a far more nuanced picture of the issue than many might imagine.

Many fear that audiences are divided by entirely diverging news media diets based on political allegiances, with the United States and its infamous blue and red news feeds often mentioned as an example. Still, recent research shows that only a small portion of the American population is trapped in ideological chambers created by partisan media, with some estimates putting the number as low as 8%.

In other western countries, media consumption is even less segregated. In Germany, different political figures and their supporters predominantly consume and share content from the same few mainstream sources. Even in the pre-Brexit UK, a rather small number of people were at risk of being caught up in informational filter bubbles.

While many of today’s social and political tendencies affecting online discourse are alarming, social media sites can still act as spaces in which people encounter highly diverse information sources and get exposed to viewpoints they don’t agree with, rather than remaining in the cosy ideological and cultural bubbles of real-world social interactions. Age also seems to be a factor: older populations show the highest levels of political polarisation despite the low level of social media use in that age group.

Even members of extremist groups do not exist in hermetically enclosed informational bubbles, but on the contrary acknowledge that other opinions exist, and can take a certain pride in ignoring and contesting mainstream beliefs.

Product mix effect 

Beyond social media, a study from 2018 questions the supposedly polarising effects of algorithmic filtering in the Google News feed: the search results for news in the sample group were relatively similar, even though certain news publications had a significantly greater presence than others.

As for news applications, other research has shown that users of personalised news apps view a higher number of sources and news categories, without a change in reported partisan content.

One important factor is serendipity, also known as the ‘product mix effect’, which keeps a consumer interested in the offered content while allowing the algorithm to evolve. First devised in retail, the product mix effect may influence our media diet just as it influences our shopping behaviour.

In a recent study, a research team from the University of Amsterdam tested the effects of content personalisation within a single-source content setting. The results indicated that, when applied to mainstream journalistic content, all common algorithmic filtering approaches resulted in recommendations that did not differ from human editorial recommendations in terms of diversity.

The study focused on diversity in terms of topics as opposed to diversity in terms of ideological viewpoints. This is a novel approach to diversity representation that might be more suitable for the European context with its highly pluralistic political and media landscapes.

Algorithmic transparency

While information diversity is certainly important for any society, the availability of counter perspectives in news and opinion pieces alone may not be the universal antidote to social fragmentation and political polarisation. A number of studies suggest that exposure to opposing views might, in fact, increase political polarisation.

Overall, it is too early to assess the exact impact technology has on political opinion building: the era of algorithmic filtering is still in its beginning stages, and much more research is needed to understand how content personalisation affects societies’ cultural, geographic and social patterns. But as application developers and media professionals, we nevertheless need to take seriously any potential risks related to new technologies, and strive to educate and empower our readers to become proactive and critical news consumers.

Similarly, the way we write about new technologies should reflect their realistic abilities and limitations. Neither an algorithm nor an editor will ever be able to include all perspectives in a curated media feed. On the other hand, we also need to create the best user experience to compete with popular applications. While social media sites may have become the go-to news sources for many, actual news makes up only a small share of the content on those platforms.

The number of approaches that aim to bring diversity, or at least transparency, to news recommendation applications is fortunately growing – see for example the BBC’s Public Algorithm, diversity-by-design approaches and public service algorithm designs. For now, we may have more questions than answers about the relationship between algorithms and user behaviour, but such resources can point us in the right direction.

 

Photo by Lanju Fotografie on Unsplash

 

CPN News Recommender Engine: how our personalisation solution works

Instead of using a single algorithm, the CPN platform takes advantage of a variety of techniques to produce personalised recommendations. The platform’s A/B testing module allows publishers to test and optimise the used configurations.

The News Recommender we have developed for CPN is a hybrid engine, i.e., it doesn’t rely on a single technique or algorithm to feed recommendations to users, but uses a mix of different techniques in order to overcome the weaknesses of each approach and to offer recommendations based on different “points of view”.

In the CPN platform, publishers can create, via API calls, different kinds of Recommenders:

  1. Content-based recommendations: based on unsupervised keyword extraction, named entity extraction, semantic uplifting techniques, etc.

  2. Collaborative filtering: recommendations based on similarities in users’ consumption histories

  3. Most Popular recommendations: recommendations based on trending items

  4. Random recommendations: some random items to add variety to the recommendations

  5. Composite recommender: quota-based combinations of the above techniques

  6. Sentiment-based recommender: a recommender that takes into account automatic sentiment analysis of the news content (is the news uplifting or depressing?)
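
A quota-based composite recommender like the one in item 5 could be sketched as follows; the function names and the fixed-quota scheme are our own illustration, not the CPN API:

```python
# Sketch: a composite recommender that fills a slate of n items using
# fixed quotas per underlying technique, de-duplicating across sources.

def composite_recommend(sources, quotas, n):
    """sources: {technique: candidate article IDs, best first};
    quotas: {technique: fraction of the slate}."""
    slate, seen = [], set()
    for name, share in quotas.items():
        take, picked = round(n * share), 0
        for item in sources.get(name, []):
            if picked >= take:
                break
            if item not in seen:    # skip items already placed by another technique
                slate.append(item)
                seen.add(item)
                picked += 1
    return slate[:n]

sources = {
    "content_based": ["a1", "a2", "a3"],
    "most_popular": ["a2", "a4", "a5"],
    "random": ["a6"],
}
quotas = {"content_based": 0.5, "most_popular": 0.3, "random": 0.2}
print(composite_recommend(sources, quotas, 5))  # → ['a1', 'a2', 'a4', 'a5', 'a6']
```

Because each underlying technique only fills its share of the slate, a weakness of one approach (for example, the low diversity of purely content-based lists) is diluted by the others, which is exactly the motivation for a hybrid engine.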

Figure 1: CPN Recommender Architecture


The architecture of the recommender is depicted in Figure 1. News is continuously ingested from publisher feeds (RSS, XML, etc.) and processed to extract useful information from each item (entity extraction, unsupervised topic extraction, etc.). The enriched items are stored in a NoSQL store on the CPN platform. The recommenders compute the most interesting articles for a specific user according to a user profile that is constantly refined by collecting user-generated events, such as clicks, reading time, explicit likes/dislikes and so on.
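
The ingestion and enrichment step described above can be pictured as a small pipeline; the capitalised-word heuristic below is a deliberately naive stand-in for the NLP techniques the platform actually uses, and the in-memory dict stands in for the NoSQL store:

```python
# Sketch: ingest a raw feed item, enrich it with extracted entities,
# and save it as a document, mirroring the pipeline described above.

import string

def extract_entities(text):
    """Toy stand-in for named entity extraction: capitalised tokens
    (the first word is skipped to avoid sentence-initial capitals)."""
    words = [w.strip(string.punctuation) for w in text.split()]
    return sorted({w for w in words[1:] if w[:1].isupper()})

def ingest(item, store):
    """Enrich a raw feed item and save it to the document store."""
    doc = dict(item)
    doc["entities"] = extract_entities(item["body"])
    store[item["id"]] = doc
    return doc

store = {}  # stands in for the NoSQL document store
ingest({"id": "n1", "body": "The chancellor met Emmanuel Macron in Berlin."}, store)
print(store["n1"]["entities"])  # → ['Berlin', 'Emmanuel', 'Macron']
```

The recommenders then only ever query the enriched documents, so new extraction techniques can be added to the pipeline without touching the recommendation logic.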

The advantage of the system is its flexibility: a high number of configurations and customisations are possible. It is unlikely that every publisher will use the same combination of techniques, and the configurations will probably need to be fine-tuned over time. For example, one publisher may privilege content-based recommendation over other techniques according to its specific business needs or the measured impact on users.

To help publishers find the configuration most suitable for their needs, we have developed the Recommender A/B testing module.

Figure 2: Recommender A/B testing


Figure 2 presents a high-level overview of the module: the users are partitioned into groups, and every user is associated with a specific recommendation technique (or a specific hybrid combination). Additionally, the publisher can define which subset of news items is fed to a specific group (for example, deciding what kind of content is delivered to users in particular age groups).

The module is tightly integrated with the CPN Recommender module and exposes an API that allows every publisher to create recommenders and user groups. It is a “configuration” module: recommenders and groups can be created and modified at any moment by the publisher or the administrators of the platform. User groups are created by specifying a name for the group; users are then added to groups via a specific API call, by specifying their user IDs.

Using this module, a publisher can test different versions of a recommender on different user groups at the same time, in order to compare the behaviour of different approaches to content personalisation. Every publisher can create an unlimited number of groups and recommenders and associate a specific recommender with a group.

A/B testing can produce concrete evidence of what actually works in personalisation. The module can be used for continuously testing new techniques and approaches in order to optimise conversion rates and gain a better understanding of customers.

Furthermore, testing is not restricted to only two groups (A/B): the module allows an unlimited number of recommenders to be tested concurrently in large-scale experimentation campaigns.

Recommender A/B Testing: usage

The module (via a REST API) allows every publisher to define:

  • An unlimited number of recommenders

  • An unlimited number of groups

  • The association of a specific recommender with a specific group

The structure of the module makes it particularly suitable for production environments. It makes it possible to continuously test new techniques and approaches in order to optimise the desired metrics, or to test configuration changes to an existing recommender and apply them to a restricted number of users before releasing them to the whole user base.
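
Since the module's actual REST endpoints are not listed here, the sketch below models the same configuration flow in memory; all class, method and recommender names are our own assumptions, standing in for the corresponding API calls:

```python
# Sketch of the A/B configuration flow: create recommenders and groups,
# add users to groups, and resolve which recommender serves a given user.

class ABConfig:
    def __init__(self):
        self.recommenders = {}   # recommender name -> technique/config
        self.groups = {}         # group name -> set of user IDs
        self.assignment = {}     # group name -> recommender name

    def create_recommender(self, name, technique):
        self.recommenders[name] = technique

    def create_group(self, name):
        self.groups[name] = set()

    def add_users(self, group, user_ids):
        self.groups[group].update(user_ids)

    def associate(self, group, recommender):
        self.assignment[group] = recommender

    def recommender_for(self, user_id):
        for group, members in self.groups.items():
            if user_id in members:
                return self.assignment.get(group)
        return None  # user not in any test group: fall back to the default

ab = ABConfig()
ab.create_recommender("hybrid-v2", "composite")
ab.create_group("test-group-A")
ab.add_users("test-group-A", ["u1", "u2"])
ab.associate("test-group-A", "hybrid-v2")
print(ab.recommender_for("u1"))  # → hybrid-v2
```

In the real module each of these steps would be one REST call, and because groups and associations can be changed at any moment, a configuration tweak can be rolled out to a small group before reaching the whole user base.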