Convert HTML To Notion Blocks


If you need to import HTML into Notion, you have probably run into the problem of converting HTML to Notion's block format. I hit this in a recent project and could not find an existing package that handles the whole conversion. In this post I share the code and libraries that made it work for me.

The first step is to parse the HTML code into a JavaScript object. To do this, we use the fromHtml function from the hast-util-from-html library, which takes an HTML string as input and returns a hast object:

    import { fromHtml } from 'hast-util-from-html';

    const html = '<h1>Hello <strong>world!</strong></h1>';

    // HTML to HTML AST
    const hast = fromHtml(html, { fragment: true });

The hast object represents the HTML code as an abstract syntax tree (AST) that can be manipulated and transformed into other formats:

    {
      type: 'root',
      children: [
        {
          type: 'element',
          tagName: 'h1',
          properties: {},
          children: [
            {
              type: 'text',
              value: 'Hello ',
              position: {
                start: { line: 1, column: 5, offset: 4 },
                end: { line: 1, column: 11, offset: 10 }
              }
            },
            {
              type: 'element',
              tagName: 'strong',
              properties: {},
              children: [
                {
                  type: 'text',
                  value: 'world!',
                  position: {
                    start: { line: 1, column: 19, offset: 18 },
                    end: { line: 1, column: 25, offset: 24 }
                  }
                }
              ],
              position: {
                start: { line: 1, column: 11, offset: 10 },
                end: { line: 1, column: 34, offset: 33 }
              }
            }
          ],
          position: {
            start: { line: 1, column: 1, offset: 0 },
            end: { line: 1, column: 39, offset: 38 }
          }
        }
      ],
      data: { quirksMode: false },
      position: {
        start: { line: 1, column: 1, offset: 0 },
        end: { line: 1, column: 39, offset: 38 }
      }
    }
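Because the tree is plain data, it can be walked with a few lines of recursion. The sketch below is only an illustration: `collectText` and `collectTagNames` are hypothetical helpers, not part of hast-util-from-html, and the tree is a hand-trimmed copy of the output above with the position fields left out.

```javascript
// Hand-trimmed copy of the hast tree shown above (position fields omitted).
const hast = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'h1',
      properties: {},
      children: [
        { type: 'text', value: 'Hello ' },
        {
          type: 'element',
          tagName: 'strong',
          properties: {},
          children: [{ type: 'text', value: 'world!' }],
        },
      ],
    },
  ],
};

// Hypothetical helper: concatenate the value of every text node, depth first.
function collectText(node) {
  if (node.type === 'text') return node.value;
  return (node.children || []).map(collectText).join('');
}

// Hypothetical helper: list the tag names of all element nodes in document order.
function collectTagNames(node) {
  const own = node.type === 'element' ? [node.tagName] : [];
  const nested = (node.children || []).flatMap(collectTagNames);
  return own.concat(nested);
}

console.log(collectText(hast)); // Hello world!
console.log(collectTagNames(hast)); // [ 'h1', 'strong' ]
```

For anything beyond a quick inspection, a traversal utility from the same ecosystem, such as unist-util-visit, is probably a better fit than hand-written recursion.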

Once the HTML is parsed into a hast object, the next step is to convert it to a Markdown AST. We use the toMdast function from the hast-util-to-mdast library, which takes a hast object as input and returns an mdast object:

    import { fromHtml } from 'hast-util-from-html';
    import { toMdast } from 'hast-util-to-mdast';

    const html = '<h1>Hello <strong>world!</strong></h1>';
    const hast = fromHtml(html, { fragment: true });

    // HTML AST to Markdown AST
    const mdast = toMdast(hast);

The mdast object represents the Markdown code as an AST, similar to the previous HTML AST:

    {
      type: 'root',
      children: [
        {
          type: 'heading',
          depth: 1,
          children: [
            {
              type: 'text',
              value: 'Hello ',
              position: {
                start: { line: 1, column: 5, offset: 4 },
                end: { line: 1, column: 11, offset: 10 }
              }
            },
            {
              type: 'strong',
              children: [
                {
                  type: 'text',
                  value: 'world!',
                  position: {
                    start: { line: 1, column: 19, offset: 18 },
                    end: { line: 1, column: 25, offset: 24 }
                  }
                }
              ],
              position: {
                start: { line: 1, column: 11, offset: 10 },
                end: { line: 1, column: 34, offset: 33 }
              }
            }
          ],
          position: {
            start: { line: 1, column: 1, offset: 0 },
            end: { line: 1, column: 39, offset: 38 }
          }
        }
      ],
      position: {
        start: { line: 1, column: 1, offset: 0 },
        end: { line: 1, column: 39, offset: 38 }
      }
    }
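Most of that output is position data, which makes the tree hard to read when you log it. A small sketch of how the positions could be dropped for debugging follows; `stripPositions` is a hypothetical helper written for this post, and the sample tree is a shortened version of the mdast above.

```javascript
// Hypothetical helper: return a deep copy of a hast or mdast node without `position` fields.
// The input tree is not modified.
function stripPositions(node) {
  const { position, children, ...rest } = node;
  if (Array.isArray(children)) {
    return { ...rest, children: children.map(stripPositions) };
  }
  return rest;
}

// Shortened version of the mdast shown above.
const sample = {
  type: 'root',
  children: [
    {
      type: 'heading',
      depth: 1,
      children: [
        {
          type: 'text',
          value: 'Hello ',
          position: {
            start: { line: 1, column: 5, offset: 4 },
            end: { line: 1, column: 11, offset: 10 },
          },
        },
      ],
      position: {
        start: { line: 1, column: 1, offset: 0 },
        end: { line: 1, column: 39, offset: 38 },
      },
    },
  ],
  position: {
    start: { line: 1, column: 1, offset: 0 },
    end: { line: 1, column: 39, offset: 38 },
  },
};

console.log(JSON.stringify(stripPositions(sample), null, 2));
```

The unist ecosystem also ships a package for this purpose, unist-util-remove-position, which may be preferable in a real project.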

The next step is to serialize the Markdown AST into a Markdown string. We use the toMarkdown function from the mdast-util-to-markdown library, which takes an mdast object as input and returns a Markdown string:

    import { fromHtml } from 'hast-util-from-html';
    import { toMdast } from 'hast-util-to-mdast';
    import { toMarkdown } from 'mdast-util-to-markdown';

    const html = '<h1>Hello <strong>world!</strong></h1>';
    const hast = fromHtml(html, { fragment: true });
    const mdast = toMdast(hast);

    // Markdown AST to Markdown string
    const markdown = toMarkdown(mdast);

The Markdown string has the format we know from editors such as those on Dev.to or GitHub:

# Hello **world!**
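To make the serialization step less of a black box, here is a toy serializer for just the node types in this example. It is a sketch for illustration, not how mdast-util-to-markdown is implemented: the function name `serialize` is made up, and it ignores escaping, lists, links, and every other node type the real library handles.

```javascript
// Position-free copy of the mdast from the previous step.
const mdast = {
  type: 'root',
  children: [
    {
      type: 'heading',
      depth: 1,
      children: [
        { type: 'text', value: 'Hello ' },
        { type: 'strong', children: [{ type: 'text', value: 'world!' }] },
      ],
    },
  ],
};

// Toy serializer: supports root, heading, strong and text nodes only.
function serialize(node) {
  const inner = () => (node.children || []).map(serialize).join('');
  switch (node.type) {
    case 'root':
      // Block-level children are separated by a blank line.
      return (node.children || []).map(serialize).join('\n\n') + '\n';
    case 'heading':
      return '#'.repeat(node.depth) + ' ' + inner();
    case 'strong':
      return '**' + inner() + '**';
    case 'text':
      return node.value;
    default:
      throw new Error('Unsupported node type: ' + node.type);
  }
}

console.log(serialize(mdast)); // # Hello **world!**
```

In the actual pipeline, keep using toMarkdown; the toy version only shows that serialization is a recursive walk over the tree.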

The final step is to convert the Markdown string into Notion block objects that can be used to create new blocks using the Notion API. We use the markdownToBlocks function from the @tryfabric/martian library, which takes the Markdown string as input and returns an array of BlockObjectRequest objects:

    import { fromHtml } from 'hast-util-from-html';
    import { toMdast } from 'hast-util-to-mdast';
    import { toMarkdown } from 'mdast-util-to-markdown';
    import { markdownToBlocks } from '@tryfabric/martian';

    const html = '<h1>Hello <strong>world!</strong></h1>';
    const hast = fromHtml(html, { fragment: true });
    const mdast = toMdast(hast);
    const markdown = toMarkdown(mdast);

    // Markdown string to Notion Blocks
    const blocks = markdownToBlocks(markdown);

Each BlockObjectRequest object represents a Notion block and contains the necessary information to create and format the block in your Notion page:

    [
      {
        object: 'block',
        type: 'heading_1',
        heading_1: {
          rich_text: [
            {
              type: 'text',
              annotations: {
                bold: false,
                strikethrough: false,
                underline: false,
                italic: false,
                code: false,
                color: 'default'
              },
              text: { content: 'Hello ', link: undefined }
            },
            {
              type: 'text',
              annotations: {
                bold: true,
                strikethrough: false,
                underline: false,
                italic: false,
                code: false,
                color: 'default'
              },
              text: { content: 'world!', link: undefined }
            }
          ]
        }
      }
    ]
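A quick way to sanity-check the conversion is to read the text back out of the blocks and compare it with the original HTML. The following sketch assumes blocks shaped like the output above; `richTextOf`, `blockPlainText` and `boldSegments` are hypothetical helpers, not functions provided by @tryfabric/martian or the Notion client.

```javascript
// Trimmed copy of the block shown above (only the fields the helpers read).
const blocks = [
  {
    object: 'block',
    type: 'heading_1',
    heading_1: {
      rich_text: [
        { type: 'text', annotations: { bold: false }, text: { content: 'Hello ' } },
        { type: 'text', annotations: { bold: true }, text: { content: 'world!' } },
      ],
    },
  },
];

// The payload of a block lives under a key named after its type, e.g. block.heading_1.
function richTextOf(block) {
  const payload = block[block.type];
  return payload && Array.isArray(payload.rich_text) ? payload.rich_text : [];
}

// Join the content of all text parts of one block.
function blockPlainText(block) {
  return richTextOf(block)
    .map((part) => (part.text ? part.text.content : ''))
    .join('');
}

// Return the content of the parts that carry the bold annotation.
function boldSegments(block) {
  return richTextOf(block)
    .filter((part) => part.text && part.annotations && part.annotations.bold)
    .map((part) => part.text.content);
}

console.log(blockPlainText(blocks[0])); // Hello world!
console.log(boldSegments(blocks[0])); // [ 'world!' ]
```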

Here is an example of how to create a new page in your Notion workspace with the blocks converted from the original HTML. We use the @notionhq/client library, and you need a valid Notion API key and a database ID. The example assumes the title property of the database is called Name:

    import { Client } from '@notionhq/client';

    // ...
    const blocks = markdownToBlocks(markdown);

    const notion = new Client({ auth: process.env.NOTION_API_KEY });

    const properties = {
      Name: {
        title: [{ text: { content: 'New Page' } }]
      }
    };

    const response = await notion.pages.create({
      parent: { database_id: process.env.NOTION_DATABASE_ID },
      properties,
      children: blocks,
    });
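One thing to watch with longer documents: Notion's request limits cap arrays of blocks at 100 elements per request, as far as I can tell from the API documentation. A possible workaround is to create the page with the first 100 blocks and append the rest with `notion.blocks.children.append`. The sketch below assumes that limit; `chunk` and `createPageInBatches` are hypothetical helper names, and `notion` is expected to be a @notionhq/client `Client` instance passed in by the caller.

```javascript
// Assumed limit on the number of blocks per request (check the current Notion docs).
const MAX_BLOCKS_PER_REQUEST = 100;

// Split an array into consecutive slices of at most `size` items.
function chunk(items, size) {
  const slices = [];
  for (let i = 0; i < items.length; i += size) {
    slices.push(items.slice(i, i + size));
  }
  return slices;
}

// Hypothetical helper: create the page with the first slice of blocks,
// then append the remaining slices one request at a time, in order.
async function createPageInBatches(notion, databaseId, properties, blocks) {
  const [first = [], ...rest] = chunk(blocks, MAX_BLOCKS_PER_REQUEST);

  const page = await notion.pages.create({
    parent: { database_id: databaseId },
    properties,
    children: first,
  });

  for (const slice of rest) {
    await notion.blocks.children.append({ block_id: page.id, children: slice });
  }

  return page;
}
```

It would be called in place of the direct `notion.pages.create` call, for example `await createPageInBatches(notion, process.env.NOTION_DATABASE_ID, properties, blocks)`. The requests are sent sequentially so the blocks keep their order.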

The response from Notion contains the new page and its properties. It does not contain the blocks, which must be queried separately:

    {
      object: 'page',
      id: 'c7d50cb4-0518-4ef9-8457-056a53fd53eb',
      created_time: '2023-03-01T10:28:00.000Z',
      last_edited_time: '2023-03-01T10:28:00.000Z',
      created_by: {
        object: 'user',
        id: '9d72c6df-be87-4bb5-8edb-1c19461d01bb'
      },
      last_edited_by: {
        object: 'user',
        id: '9d72c6df-be87-4bb5-8edb-1c19461d01bb'
      },
      cover: null,
      icon: null,
      parent: {
        type: 'database_id',
        database_id: '95970798-23cc-4f7c-a0c5-4870b80fcdee'
      },
      archived: false,
      properties: {
        Name: {
          id: 'title',
          type: 'title',
          title: [
            {
              type: 'text',
              text: { content: 'New Page', link: null },
              annotations: {
                bold: false,
                italic: false,
                strikethrough: false,
                underline: false,
                code: false,
                color: 'default'
              },
              plain_text: 'New Page',
              href: null
            }
          ]
        },
        // ...
      },
      url: 'https://www.notion.so/New-Page-c7d50cb405184ef98457056a53fd53eb'
    }
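To read the blocks back, the page ID can be passed as `block_id` to `notion.blocks.children.list`, which returns the children in pages with a cursor. Here is a minimal sketch of following that cursor; `listAllChildren` is a hypothetical helper name, and `notion` is again assumed to be a @notionhq/client `Client` instance supplied by the caller.

```javascript
// Hypothetical helper: collect all direct child blocks of a page or block,
// following the pagination cursor until the API reports no more results.
async function listAllChildren(notion, blockId) {
  const results = [];
  let cursor;

  do {
    const response = await notion.blocks.children.list({
      block_id: blockId,
      start_cursor: cursor,
      page_size: 100,
    });
    results.push(...response.results);
    cursor = response.has_more ? response.next_cursor : undefined;
  } while (cursor);

  return results;
}
```

With the response from the page creation above, that would be `const children = await listAllChildren(notion, response.id)`. Only direct children are returned; nested blocks need another call with the ID of their parent block.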