The Substack Importer will import content from an export file downloaded from your Substack newsletter.
The following content will be imported:
In the future, we plan to improve the importer by:
For running unit tests and contributing to the plugin, see the README on GitHub.
Tests can be run with wp-env or with any local WordPress setup paired with a Docker MySQL container. Run composer install first, then vendor/bin/phpunit.
The Substack Importer provides filters and actions at key stages of the content conversion pipeline.
Filter the post metadata loaded from the Substack API before it is used for author, comments, and other post data.
Parameters:
* $post_meta (array|null) – The post metadata from the Substack API response.
* $post (array) – The raw Substack post data from the CSV.
* $id (int) – The Substack post ID.
Filter the raw HTML content before Gutenberg conversion. Runs after the subtitle has been prepended (if present). Useful for cleaning up Substack-specific HTML, adding custom elements, or stripping unwanted markup.
Parameters:
* $html_body (string) – The raw HTML content from the Substack export.
* $post (array) – The raw Substack post data from the CSV.
* $post_meta (array|null) – The post metadata from the Substack API response.
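As a rough illustration of the raw HTML filter described above, a callback might strip markup before conversion. This is only a sketch: the function name `my_substack_clean_html_body` is hypothetical, the two cleanup rules (empty paragraphs and inline style attributes) are arbitrary examples rather than something the importer is known to require, and the hook name is left as a placeholder because it is not spelled out in this section.

```php
<?php
/**
 * Hypothetical callback: tidy the raw HTML before Gutenberg conversion.
 *
 * @param string     $html_body Raw HTML from the Substack export.
 * @param array      $post      Raw Substack post data (unused here).
 * @param array|null $post_meta Post metadata from the API (unused here).
 * @return string Cleaned HTML, or the original HTML if a regex fails.
 */
function my_substack_clean_html_body( $html_body, $post = array(), $post_meta = null ) {
	$patterns = array(
		'#<p>\s*</p>#i',        // Empty paragraphs (illustrative rule).
		'/\sstyle="[^"]*"/i',   // Inline style attributes (illustrative rule).
	);

	$cleaned = preg_replace( $patterns, '', (string) $html_body );

	// preg_replace() returns null on error; fall back to the input.
	return null === $cleaned ? $html_body : $cleaned;
}

// Registration inside WordPress. 'HOOK_NAME_HERE' is a placeholder, not a real hook name:
// add_filter( 'HOOK_NAME_HERE', 'my_substack_clean_html_body', 10, 3 );
```

The last argument to `add_filter()` is set to 3 so that all three documented parameters reach the callback.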
Filter the subtitle HTML before it is prepended to the post content. Return an empty string to skip the subtitle entirely.
Parameters:
* $heading (string) – The subtitle HTML (default: an h2 element).
* $post (array) – The raw Substack post data.
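The "return an empty string to skip" contract of the subtitle filter could be used roughly as follows. This sketch assumes a hypothetical callback name (`my_substack_filter_subtitle`) and a placeholder hook name; it drops subtitles that contain no visible text and leaves all others untouched.

```php
<?php
/**
 * Hypothetical callback: skip subtitles that have no visible text.
 *
 * @param string $heading Subtitle HTML (an h2 element by default).
 * @param array  $post    Raw Substack post data (unused here).
 * @return string The unchanged subtitle HTML, or '' to skip it.
 */
function my_substack_filter_subtitle( $heading, $post = array() ) {
	// Strip the tags and surrounding whitespace to see whether any text remains.
	$text = trim( strip_tags( (string) $heading ) );

	if ( '' === $text ) {
		return ''; // An empty string tells the importer to skip the subtitle.
	}

	return $heading;
}

// Registration inside WordPress. 'HOOK_NAME_HERE' is a placeholder, not a real hook name:
// add_filter( 'HOOK_NAME_HERE', 'my_substack_filter_subtitle', 10, 2 );
```

To drop every subtitle unconditionally, WordPress core's `__return_empty_string` helper can be passed as the callback instead.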
Filter the post content after Gutenberg conversion but before it is added to the WXR. Useful for wrapping paywalled content in custom blocks (e.g., membership plugins).
Parameters:
* $post_content (string) – The converted Gutenberg block content.
* $post (array) – The original Substack post data.
* $post_meta (array|null) – Additional post metadata from the Substack API.
Filter the final post data array before it is added to the WXR.
Parameters:
* $post_data (array) – The post data.
* $post (array) – The original Substack post data.
Filter the result of a single node conversion to a Gutenberg block. Allows modification of the block name and attributes. Return a null block_name to skip the node.
Parameters:
* $block_data (array) – Array with 'block_name' and 'block_attributes' keys.
* $node (DOMElement) – The converted DOM node.
* $node_name (string) – The original HTML tag name (e.g. 'p', 'div', 'h2').
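A sketch of how the node conversion filter might be used to skip certain nodes by returning a null block_name. The callback name and the CSS class `example-widget` are made up for illustration, and the hook name is a placeholder; only the 'block_name' key and the three parameters come from the description above.

```php
<?php
/**
 * Hypothetical callback: skip <div> nodes carrying a given CSS class.
 *
 * @param array           $block_data Array with 'block_name' and 'block_attributes' keys.
 * @param DOMElement|null $node       The converted DOM node.
 * @param string          $node_name  The original HTML tag name.
 * @return array Possibly modified block data.
 */
function my_substack_skip_widget_nodes( $block_data, $node = null, $node_name = '' ) {
	if ( 'div' !== $node_name || ! $node instanceof DOMElement ) {
		return $block_data;
	}

	// getAttribute() returns '' when the attribute is absent.
	$classes = preg_split( '/\s+/', trim( $node->getAttribute( 'class' ) ) );

	// 'example-widget' is an illustrative class name, not a known Substack class.
	if ( in_array( 'example-widget', $classes, true ) ) {
		$block_data['block_name'] = null; // A null block_name skips the node.
	}

	return $block_data;
}

// Registration inside WordPress. 'HOOK_NAME_HERE' is a placeholder, not a real hook name:
// add_filter( 'HOOK_NAME_HERE', 'my_substack_skip_widget_nodes', 10, 3 );
```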
Filter the image node conversion result. Useful for adjusting image sizes, captions, or link destinations.
Parameters:
* $result (array) – Array with 'block_attributes' and 'node' keys.
* $image_data (array|null) – The decoded image data from the Substack data-attrs attribute.
Short-circuit the embed node conversion before default handling. Return a non-null array to skip the built-in switch statement entirely. Useful for handling unsupported embed types or overriding the default conversion for a specific provider.
Parameters:
* $pre_result (array|null) – Return non-null to short-circuit. Expected keys: 'node', 'block_attributes', 'block_name'.
* $node (DOMElement) – The embed DOM node before conversion.
* $parent (DOMElement) – The parent DOM element.
* $first_class (string) – The CSS class identifying the embed type (e.g. 'youtube-wrap', 'tweet').
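The short-circuit behaviour above might be used along these lines. Much of this sketch is assumption: the embed class `example-video-wrap`, the `data-url` attribute, the `core/embed` block name format, and the shape of 'block_attributes' are all illustrative guesses, and the hook name is a placeholder. Only the three expected keys and the "non-null means short-circuit" rule come from the description.

```php
<?php
/**
 * Hypothetical callback: handle one custom embed type ourselves.
 *
 * @param array|null      $pre_result  Non-null short-circuits the default handling.
 * @param DOMElement|null $node        The embed DOM node before conversion.
 * @param DOMElement|null $parent      The parent DOM element (unused here).
 * @param string          $first_class CSS class identifying the embed type.
 * @return array|null
 */
function my_substack_pre_embed( $pre_result, $node = null, $parent = null, $first_class = '' ) {
	// Respect an earlier callback that already short-circuited.
	if ( null !== $pre_result ) {
		return $pre_result;
	}

	// 'example-video-wrap' is an assumed class name for illustration only.
	if ( 'example-video-wrap' !== $first_class || ! $node instanceof DOMElement ) {
		return null;
	}

	// 'data-url' is an assumed attribute; fall through to the default if missing.
	$url = $node->getAttribute( 'data-url' );
	if ( '' === $url ) {
		return null;
	}

	return array(
		'node'             => $node,
		'block_name'       => 'core/embed',          // Assumed block name format.
		'block_attributes' => array( 'url' => $url ), // Assumed attribute shape.
	);
}

// Registration inside WordPress. 'HOOK_NAME_HERE' is a placeholder, not a real hook name:
// add_filter( 'HOOK_NAME_HERE', 'my_substack_pre_embed', 10, 4 );
```

Returning null for every case the callback does not recognise keeps the built-in conversion in place for all other embed types.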
Filter the embed node conversion result after the default conversion. Useful for modifying embed URLs, adding custom attributes, or changing how embeds are represented.
Parameters:
* $output (array) – Array with 'block_name', 'block_attributes', and 'node' keys.
* $first_class (string) – The CSS class identifying the embed type.
Filter the Gutenberg audio block HTML for podcast posts.
Parameters:
* $block (string) – The Gutenberg audio block HTML.
* $audio_url (string) – The URL of the podcast audio file.
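One possible use of the audio block filter is to append a download link below the player. In this sketch the callback name is hypothetical and the hook name is a placeholder; the appended markup is a standard Gutenberg paragraph block. Inside WordPress, `esc_url()` would be the usual choice for escaping the URL; `htmlspecialchars()` is used here so the snippet also runs outside WordPress.

```php
<?php
/**
 * Hypothetical callback: add a download link after the audio block.
 *
 * @param string $block     The Gutenberg audio block HTML.
 * @param string $audio_url The URL of the podcast audio file.
 * @return string The audio block, followed by a paragraph block with a link.
 */
function my_substack_audio_with_download( $block, $audio_url = '' ) {
	if ( '' === $audio_url ) {
		return $block; // Nothing to link to; leave the block unchanged.
	}

	$link = sprintf(
		"<!-- wp:paragraph -->\n<p><a href=\"%s\">Download this episode</a></p>\n<!-- /wp:paragraph -->",
		htmlspecialchars( $audio_url, ENT_QUOTES, 'UTF-8' )
	);

	return $block . "\n\n" . $link;
}

// Registration inside WordPress. 'HOOK_NAME_HERE' is a placeholder, not a real hook name:
// add_filter( 'HOOK_NAME_HERE', 'my_substack_audio_with_download', 10, 2 );
```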
Filter the paywall marker text that appears in the imported content.
Parameters:
* $marker_text (string) – The default paywall marker text.
* $node (DOMElement) – The paywall node being converted.
* $parent (DOMElement) – The parent element.
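Replacing the paywall marker text could look something like this. The callback name and the replacement wording are illustrative, and the hook name is a placeholder.

```php
<?php
/**
 * Hypothetical callback: use custom wording for the paywall marker.
 *
 * @param string          $marker_text The default paywall marker text.
 * @param DOMElement|null $node        The paywall node being converted (unused here).
 * @param DOMElement|null $parent      The parent element (unused here).
 * @return string The marker text to place in the imported content.
 */
function my_substack_paywall_marker( $marker_text, $node = null, $parent = null ) {
	// Illustrative wording; any string can be returned here.
	return 'Subscriber-only content begins here';
}

// Registration inside WordPress. 'HOOK_NAME_HERE' is a placeholder, not a real hook name:
// add_filter( 'HOOK_NAME_HERE', 'my_substack_paywall_marker', 10, 3 );
```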
Filter the entire paywall conversion result. Return a non-null value to override the default conversion.
Parameters:
* $result (array|null) – The conversion result, null to use default.
* $node (DOMElement) – The paywall node being converted.
* $parent (DOMElement) – The parent element.
Fires before a single Substack post is processed and converted. Useful for setting up state or performing actions before conversion begins.
Parameters:
* $post (array) – The raw Substack post data from the CSV.
* $post_meta (array|null) – The post metadata from the Substack API response.
* $id (int) – The Substack post ID.
Fires after a single Substack post has been converted and added to the WXR. Useful for logging, progress tracking, or performing cleanup after each post.
Parameters:
* $post_data (array) – The final post data that was added to the WXR.
* $post (array) – The raw Substack post data from the CSV.
* $post_meta (array|null) – The post metadata from the Substack API response.
* $id (int) – The Substack post ID.
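For the post-conversion action, a simple progress logger might look like the following. The class name `My_Substack_Import_Progress` is hypothetical and the hook name is a placeholder; the sketch only relies on the four documented parameters and writes to PHP's error log.

```php
<?php
/**
 * Hypothetical progress tracker for the "after post converted" action.
 */
final class My_Substack_Import_Progress {
	/** @var int[] Substack post IDs seen so far. */
	private static $ids = array();

	/**
	 * Record one converted post and log a progress line.
	 *
	 * @param array      $post_data Final post data added to the WXR (unused here).
	 * @param array      $post      Raw Substack post data (unused here).
	 * @param array|null $post_meta Post metadata from the API (unused here).
	 * @param int        $id        The Substack post ID.
	 */
	public static function record( $post_data, $post = array(), $post_meta = null, $id = 0 ) {
		self::$ids[] = (int) $id;
		error_log(
			sprintf( 'Substack import: converted post %d (%d so far)', (int) $id, count( self::$ids ) )
		);
	}

	/** @return int Number of posts recorded so far. */
	public static function count() {
		return count( self::$ids );
	}
}

// Registration inside WordPress. 'HOOK_NAME_HERE' is a placeholder, not a real hook name:
// add_action( 'HOOK_NAME_HERE', array( 'My_Substack_Import_Progress', 'record' ), 10, 4 );
```

Because actions do not return a value, the callback keeps its state in a static property rather than passing anything back to the importer.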