Convert HTML To Notion Blocks
If you need to import HTML into Notion, you have probably faced the challenge of converting HTML to Notion's block format. In a recent project I ran into this problem and could not find an existing package that solved it, so in this post I want to share the code and libraries that made it work. The approach chains four conversions: HTML to an HTML AST (hast), hast to a Markdown AST (mdast), mdast to a Markdown string, and finally the Markdown string to Notion blocks.
The first step is to parse the HTML code into a JavaScript object. To do this, we use the fromHtml function from the hast-util-from-html library, which takes an HTML string as input and returns a hast object:
import { fromHtml } from 'hast-util-from-html';

const html = '<h1>Hello <strong>world!</strong></h1>';

// HTML to HTML AST
const hast = fromHtml(html, { fragment: true });
The hast object represents the HTML code as an abstract syntax tree (AST) that can be manipulated and transformed into other formats:
{
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'h1',
      properties: {},
      children: [
        {
          type: 'text',
          value: 'Hello ',
          position: {
            start: { line: 1, column: 5, offset: 4 },
            end: { line: 1, column: 11, offset: 10 }
          }
        },
        {
          type: 'element',
          tagName: 'strong',
          properties: {},
          children: [
            {
              type: 'text',
              value: 'world!',
              position: {
                start: { line: 1, column: 19, offset: 18 },
                end: { line: 1, column: 25, offset: 24 }
              }
            }
          ],
          position: {
            start: { line: 1, column: 11, offset: 10 },
            end: { line: 1, column: 34, offset: 33 }
          }
        }
      ],
      position: {
        start: { line: 1, column: 1, offset: 0 },
        end: { line: 1, column: 39, offset: 38 }
      }
    }
  ],
  data: { quirksMode: false },
  position: {
    start: { line: 1, column: 1, offset: 0 },
    end: { line: 1, column: 39, offset: 38 }
  }
}
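Because the hast tree is a plain JavaScript object, it can be inspected with an ordinary recursive function before any conversion happens. The following is only a small sketch: collectText is a hypothetical helper name (it is not part of hast-util-from-html), and the tree is a hand-written copy of the output above with the position fields left out.

```javascript
// Hand-written copy of the hast tree shown above (position fields omitted).
const hast = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'h1',
      properties: {},
      children: [
        { type: 'text', value: 'Hello ' },
        {
          type: 'element',
          tagName: 'strong',
          properties: {},
          children: [{ type: 'text', value: 'world!' }]
        }
      ]
    }
  ]
};

// Hypothetical helper: concatenate the value of every text node, depth first.
function collectText(node) {
  if (node.type === 'text') return node.value;
  // Nodes without children (comments, for example) contribute nothing.
  return (node.children || []).map(collectText).join('');
}

console.log(collectText(hast)); // Hello world!
```

A walk like this is handy for sanity-checking the parse result, for example to confirm that no text was lost, before handing the tree to the next step.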
Once the HTML code is parsed into a hast object, the next step is to convert it to a Markdown AST. We use the toMdast function from the hast-util-to-mdast library, which takes a hast object as input and returns an mdast object:
import { fromHtml } from 'hast-util-from-html';
import { toMdast } from 'hast-util-to-mdast';

const html = '<h1>Hello <strong>world!</strong></h1>';

const hast = fromHtml(html, { fragment: true });

// HTML AST to Markdown AST
const mdast = toMdast(hast);
The mdast object represents the Markdown document as an AST, similar to the previous HTML AST:
{
  type: 'root',
  children: [
    {
      type: 'heading',
      depth: 1,
      children: [
        {
          type: 'text',
          value: 'Hello ',
          position: {
            start: { line: 1, column: 5, offset: 4 },
            end: { line: 1, column: 11, offset: 10 }
          }
        },
        {
          type: 'strong',
          children: [
            {
              type: 'text',
              value: 'world!',
              position: {
                start: { line: 1, column: 19, offset: 18 },
                end: { line: 1, column: 25, offset: 24 }
              }
            }
          ],
          position: {
            start: { line: 1, column: 11, offset: 10 },
            end: { line: 1, column: 34, offset: 33 }
          }
        }
      ],
      position: {
        start: { line: 1, column: 1, offset: 0 },
        end: { line: 1, column: 39, offset: 38 }
      }
    }
  ],
  position: {
    start: { line: 1, column: 1, offset: 0 },
    end: { line: 1, column: 39, offset: 38 }
  }
}
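Comparing the two trees shows what the conversion does: element nodes identified by tagName become Markdown node types (h1 becomes a heading with depth 1, strong becomes strong). To make that correspondence concrete, here is a deliberately tiny toy converter. It is my own illustration, not how hast-util-to-mdast is implemented; toyHastToMdast is a hypothetical name, it covers only a handful of elements, and it ignores positions, whitespace handling and everything else the real library takes care of.

```javascript
// Hand-written hast tree for '<h1>Hello <strong>world!</strong></h1>'
// (position fields omitted).
const hastTree = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'h1',
      properties: {},
      children: [
        { type: 'text', value: 'Hello ' },
        {
          type: 'element',
          tagName: 'strong',
          properties: {},
          children: [{ type: 'text', value: 'world!' }]
        }
      ]
    }
  ]
};

// Toy converter covering text, h1-h6, <strong>/<b> and <em>/<i> only.
function toyHastToMdast(node) {
  if (node.type === 'text') {
    return { type: 'text', value: node.value };
  }
  const children = (node.children || []).map(toyHastToMdast);
  if (node.type === 'root') {
    return { type: 'root', children };
  }
  const heading = /^h([1-6])$/.exec(node.tagName);
  if (heading) {
    return { type: 'heading', depth: Number(heading[1]), children };
  }
  if (node.tagName === 'strong' || node.tagName === 'b') {
    return { type: 'strong', children };
  }
  if (node.tagName === 'em' || node.tagName === 'i') {
    return { type: 'emphasis', children };
  }
  // Anything else is outside the scope of this sketch.
  throw new Error(`Element <${node.tagName}> is not covered by this sketch`);
}

console.log(JSON.stringify(toyHastToMdast(hastTree), null, 2));
```

For real input, stick with toMdast: HTML has far more elements and edge cases than a sketch like this can cover.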
The next step is to serialize the Markdown AST into a Markdown string. We use the toMarkdown function from the mdast-util-to-markdown library, which takes an mdast object as input and returns a Markdown string:
import { fromHtml } from 'hast-util-from-html';
import { toMdast } from 'hast-util-to-mdast';
import { toMarkdown } from 'mdast-util-to-markdown';

const html = '<h1>Hello <strong>world!</strong></h1>';

const hast = fromHtml(html, { fragment: true });
const mdast = toMdast(hast);

// Markdown AST to Markdown string
const markdown = toMarkdown(mdast);
The Markdown string has the format we are used to from editors such as those on Dev.to or GitHub:
# Hello **world!**
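Serializing is conceptually a walk over the mdast tree that emits the matching Markdown syntax for each node type. The toy serializer below sketches that idea for the three node types in this example. It is an illustration under my own assumptions, not the implementation of mdast-util-to-markdown: toyToMarkdown is a hypothetical name, and unlike the real library it does no escaping of special characters and, as far as I can tell, differs in details such as the trailing newline.

```javascript
// Hand-written copy of the mdast tree from the previous step
// (position fields omitted).
const mdastTree = {
  type: 'root',
  children: [
    {
      type: 'heading',
      depth: 1,
      children: [
        { type: 'text', value: 'Hello ' },
        { type: 'strong', children: [{ type: 'text', value: 'world!' }] }
      ]
    }
  ]
};

// Toy serializer covering root, heading, strong and text nodes only.
function toyToMarkdown(node) {
  // Serialize all children and glue them together without a separator.
  const inline = () => (node.children || []).map(toyToMarkdown).join('');
  switch (node.type) {
    case 'root':
      // Block-level children are separated by a blank line.
      return (node.children || []).map(toyToMarkdown).join('\n\n');
    case 'heading':
      return '#'.repeat(node.depth) + ' ' + inline();
    case 'strong':
      return '**' + inline() + '**';
    case 'text':
      return node.value;
    default:
      throw new Error(`Node type "${node.type}" is not covered by this sketch`);
  }
}

console.log(toyToMarkdown(mdastTree)); // # Hello **world!**
```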
The final step is to convert the Markdown string into Notion block objects that can be used to create new blocks using the Notion API. We use the markdownToBlocks function from the @tryfabric/martian library, which takes the Markdown string as input and returns an array of BlockObjectRequest objects:
import { fromHtml } from 'hast-util-from-html';
import { toMdast } from 'hast-util-to-mdast';
import { toMarkdown } from 'mdast-util-to-markdown';
import { markdownToBlocks } from '@tryfabric/martian';

const html = '<h1>Hello <strong>world!</strong></h1>';

const hast = fromHtml(html, { fragment: true });
const mdast = toMdast(hast);
const markdown = toMarkdown(mdast);

// Markdown string to Notion Blocks
const blocks = markdownToBlocks(markdown);
Each BlockObjectRequest object represents a Notion block and contains the necessary information to create and format the block in your Notion page:
[
  {
    object: 'block',
    type: 'heading_1',
    heading_1: {
      rich_text: [
        {
          type: 'text',
          annotations: {
            bold: false,
            strikethrough: false,
            underline: false,
            italic: false,
            code: false,
            color: 'default'
          },
          text: { content: 'Hello ', link: undefined }
        },
        {
          type: 'text',
          annotations: {
            bold: true,
            strikethrough: false,
            underline: false,
            italic: false,
            code: false,
            color: 'default'
          },
          text: { content: 'world!', link: undefined }
        }
      ]
    }
  }
]
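Before sending anything to Notion, it can be useful to check that the blocks still carry the text and formatting of the original HTML. In the output above, the type-specific payload sits under a key named after the block's type (heading_1 here), and the formatting lives in each rich text item's annotations. The sketch below reads both back out; blockPlainText and boldSegments are hypothetical helper names of mine, and the sample data is a trimmed hand-written copy of the output above.

```javascript
// Trimmed hand-written copy of the blocks array shown above.
const blocks = [
  {
    object: 'block',
    type: 'heading_1',
    heading_1: {
      rich_text: [
        { type: 'text', annotations: { bold: false }, text: { content: 'Hello ' } },
        { type: 'text', annotations: { bold: true }, text: { content: 'world!' } }
      ]
    }
  }
];

// Return the rich_text array of a block, or an empty array if it has none.
function richTextOf(block) {
  const payload = block[block.type];
  return (payload && payload.rich_text) || [];
}

// Hypothetical helper: the block's text without any formatting.
function blockPlainText(block) {
  return richTextOf(block)
    .map((item) => (item.text ? item.text.content : ''))
    .join('');
}

// Hypothetical helper: the text segments that are marked bold.
function boldSegments(block) {
  return richTextOf(block)
    .filter((item) => item.annotations && item.annotations.bold && item.text)
    .map((item) => item.text.content);
}

console.log(blockPlainText(blocks[0])); // Hello world!
console.log(boldSegments(blocks[0]));   // [ 'world!' ]
```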
Here is an example of how to create a new page in your Notion workspace with the blocks converted from the original HTML. We use the @notionhq/client library; you will need a valid Notion API key and database ID:
import { Client } from '@notionhq/client';

// ...

const blocks = markdownToBlocks(markdown);

const notion = new Client({ auth: process.env.NOTION_API_KEY });

const properties = {
  Name: { title: [{ text: { content: 'New Page' } }] }
};

const response = await notion.pages.create({
  parent: { database_id: process.env.NOTION_DATABASE_ID },
  properties,
  children: blocks,
});
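One caveat for longer documents, which I state as an assumption you should verify against Notion's request limits documentation: the API caps how many blocks a single request may carry (100 per array at the time of writing, to my knowledge). A possible workaround, sketched below, is to create the page with the first batch and append the rest with notion.blocks.children.append. The names chunk, createPageInChunks and MAX_BLOCKS_PER_REQUEST are mine, and the client is passed in as a parameter so the function does not depend on how you construct it.

```javascript
// Assumed per-request limit; check the Notion API docs for the current value.
const MAX_BLOCKS_PER_REQUEST = 100;

// Split an array into consecutive slices of at most `size` items.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// `notion` is expected to be a Client instance from @notionhq/client.
async function createPageInChunks(notion, databaseId, properties, blocks) {
  const [first = [], ...rest] = chunk(blocks, MAX_BLOCKS_PER_REQUEST);

  // The page itself is created together with the first batch of blocks.
  const page = await notion.pages.create({
    parent: { database_id: databaseId },
    properties,
    children: first,
  });

  // Remaining batches are appended one after another to keep their order.
  for (const children of rest) {
    await notion.blocks.children.append({ block_id: page.id, children });
  }

  return page;
}
```

The batches are appended sequentially rather than in parallel so that the blocks end up in the same order as in the source HTML.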
The response from notion.pages.create contains the new page and its properties. It does not contain the blocks, since these must be queried separately:
{
  object: 'page',
  id: 'c7d50cb4-0518-4ef9-8457-056a53fd53eb',
  created_time: '2023-03-01T10:28:00.000Z',
  last_edited_time: '2023-03-01T10:28:00.000Z',
  created_by: { object: 'user', id: '9d72c6df-be87-4bb5-8edb-1c19461d01bb' },
  last_edited_by: { object: 'user', id: '9d72c6df-be87-4bb5-8edb-1c19461d01bb' },
  cover: null,
  icon: null,
  parent: {
    type: 'database_id',
    database_id: '95970798-23cc-4f7c-a0c5-4870b80fcdee'
  },
  archived: false,
  properties: {
    Name: {
      id: 'title',
      type: 'title',
      title: [
        {
          type: 'text',
          text: { content: 'New Page', link: null },
          annotations: {
            bold: false,
            italic: false,
            strikethrough: false,
            underline: false,
            code: false,
            color: 'default'
          },
          plain_text: 'New Page',
          href: null
        }
      ]
    },
    // ...
  },
  url: 'https://www.notion.so/New-Page-c7d50cb405184ef98457056a53fd53eb'
}
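To read the blocks back, the page ID from the response can be passed to notion.blocks.children.list, which returns results one page at a time together with has_more and next_cursor. The following is a minimal sketch of following that cursor until everything is fetched. listAllBlocks is a hypothetical helper name, the client is passed in as a parameter, and only the direct children of the page are returned; nested blocks would need a further call per block.

```javascript
// `notion` is expected to be a Client instance from @notionhq/client.
async function listAllBlocks(notion, blockId) {
  const blocks = [];
  let cursor; // undefined means "start from the first page of results"

  do {
    const args = { block_id: blockId };
    if (cursor) args.start_cursor = cursor;

    const response = await notion.blocks.children.list(args);
    blocks.push(...response.results);

    // Continue only while the API reports more results.
    cursor = response.has_more ? response.next_cursor : undefined;
  } while (cursor);

  return blocks;
}

// Usage with the page created earlier (inside an ES module or async function):
// const pageBlocks = await listAllBlocks(notion, response.id);
```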