@manasiSantFT
Contributor

@manasiSantFT manasiSantFT commented Jul 22, 2025

Description

This PR introduces a temporary solution to support Flourish graphics in articles by ensuring they are included in the content.embeds array, even when not embedded.

Ideally, Flourish graphics should be parsed and included in the content.embeds section. Currently, they are not, which affects downstream rendering. This change bridges that gap until a more robust solution is implemented.

Code changes

  1. server/lib/enrich/article.js
  • Parses and extracts Flourish graphics from the HTML body.
  • Constructs a fallback image URL for the Flourish content.
  • Creates a properly structured embed object and appends it to the content.embeds array.
  • Adds corresponding unit tests for this logic.
  2. server/lib/builders/document-builder.js
  • Updates removeNonSyndicatableImages to preserve Flourish content and prevent it from being removed inadvertently.
  • Includes unit tests to verify the updated logic.

Ticket

https://financialtimes.atlassian.net/browse/LIF-612

@manasiSantFT manasiSantFT changed the title Lif 612/flourish content LIF 612/flourish content Jul 22, 2025
@next-team next-team temporarily deployed to ft-next-synd-lif-612-fl-8ipdfj July 22, 2025 12:32 Inactive
@manasiSantFT manasiSantFT changed the title LIF 612/flourish content LIF 612 - Add support for Flourish graphics Jul 22, 2025


if(isFlourishElement) {
const match = elementSrc.match(/\/visualisation\/(\d+)\//);

Comment: What is the value of elementSrc?

Instead of running a regex over the whole string, we can use the built-in URL parser and then apply a focused regex only to the pathname. Regexes are fragile when matched against full URLs, as they do not cover URL variations such as query strings or a missing trailing slash:

// Parse the URL
const urlObj = new URL(elementSrc);
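
A minimal sketch of how that could look end to end. The helper name `getFlourishId` and the exact pathname pattern are assumptions for illustration, not code from this PR; the host and the `/visualisation/<id>` path shape are taken from the snippets under review.

```javascript
// Hypothetical helper (not part of the PR): returns the Flourish
// visualisation ID as a string, or null if `src` is not a Flourish URL.
function getFlourishId(src) {
  if (!src) return null;

  let url;
  try {
    // new URL() throws a TypeError on relative or malformed input.
    url = new URL(src);
  } catch {
    return null;
  }

  // Compare the parsed hostname instead of searching the raw string,
  // so the host name appearing in a query string does not count.
  if (url.hostname !== 'public.flourish.studio') return null;

  // The regex only sees the pathname, so query strings and fragments
  // cannot interfere. A trailing slash after the ID is optional here.
  const match = url.pathname.match(/^\/visualisation\/(\d+)(?:\/|$)/);
  return match ? match[1] : null;
}
```

With this shape, the caller only needs a null check, and the decision about what counts as a Flourish URL lives in one place.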

let isFlourishElement = false;
// identify flourish element
const elementSrc = el.getAttribute('src');
if(elementSrc?.includes('public.flourish.studio/')) {

Comment/suggestion: I'd consider a more functional approach instead of an imperative style when making changes to the DOM. We could simplify this to something like:

// Determine if it is a Flourish Element
const isFlourishElement = elementSrc?.includes('public.flourish.studio/');
// Determine image type
const imageType = isFlourishElement
? 'graphic'
: el.getAttribute('data-image-type');

This reduces the number of declarations and leaves less code to maintain.
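
The same idea could also be pulled into a small pure function, which makes it easy to unit test. This is only a sketch: `resolveImageType` is a hypothetical name, and `el` is assumed to be any object exposing `getAttribute`, such as a DOM element.

```javascript
// Hypothetical helper (not part of the PR): derives the image type
// from an element without any mutable flags.
function resolveImageType(el) {
  const elementSrc = el.getAttribute('src');

  // Boolean() normalises the undefined that optional chaining yields
  // when the element has no src attribute.
  const isFlourishElement = Boolean(
    elementSrc?.includes('public.flourish.studio/')
  );

  return isFlourishElement ? 'graphic' : el.getAttribute('data-image-type');
}
```

Because the function has no side effects, a test can pass in a plain stub object instead of a real DOM node.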

const RE_BAD_CHARS = /[^A-Za-z0-9_]/gm;
const RE_SPACE = /\s/gm;

function extractFourishEmbeds(contentHTMLBody) {

Typo: extractFlourishEmbeds

const flourishEmbeds = [];
let match;

while ((match = flourishIdRegex.exec(contentHTMLBody)) !== null) {

Comment: RegExp.prototype.exec is stateful (with the g flag, the regex keeps its position in lastIndex between calls), explained here. Maybe we can use a simpler approach that doesn't involve any state.

Example:

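const ids = [];
// matchAll requires the regex to have the global (g) flag
for (const match of contentHTMLBody.matchAll(flourishIdRegex)) {
  // match[1] is the first capture group, which I believe is the ID
  ids.push(match[1]);
}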
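
To make the concern concrete, here is a small self-contained sketch. The regex, the sample HTML and the name `extractFlourishIds` are assumptions for illustration, not the PR's actual values. It shows a shared global regex giving different answers for the same input, and a `matchAll` variant that builds a fresh regex on every call so no `lastIndex` state can leak between calls.

```javascript
// Hypothetical sample input with two Flourish iframes.
const sampleHtml =
  '<iframe src="https://public.flourish.studio/visualisation/111/embed"></iframe>' +
  '<iframe src="https://public.flourish.studio/visualisation/222/embed"></iframe>';
const singleHtml =
  '<iframe src="https://public.flourish.studio/visualisation/111/embed"></iframe>';

// Pitfall: a global regex shared across calls remembers where it stopped.
const sharedRegex = /public\.flourish\.studio\/visualisation\/(\d+)\//g;
const firstTest = sharedRegex.test(singleHtml);  // true, lastIndex moves past the match
const secondTest = sharedRegex.test(singleHtml); // false, the search resumed from lastIndex

// Alternative: create the regex inside the function and iterate with matchAll.
function extractFlourishIds(html) {
  const re = /public\.flourish\.studio\/visualisation\/(\d+)\//g;
  // Array.from accepts the iterator returned by matchAll plus a mapping function.
  return Array.from(html.matchAll(re), (match) => match[1]);
}

const ids = extractFlourishIds(sampleHtml); // ['111', '222']
```

One caveat worth keeping in mind: `matchAll` starts from the `lastIndex` of the regex it is given, so passing it a shared regex that was previously used with `exec` or `test` can still skip matches. Creating the regex inside the function avoids that.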

4 participants