Package @cm4all-wp-impex/generator

@cm4all-wp-impex/generator simplifies the conversion of any content or website to WordPress using ImpEx WordPress Plugin.

This package provides a foundation of JavaScript functions/classes for transforming almost any kind of data into WordPress content.

@cm4all-wp-impex/generator is especially useful for converting bare HTML content and website-builder/CMS generated HTML into WordPress content.

The framework does not require a WordPress instance. Instead, it offers an extensible platform for generating WordPress content consumable by the ImpEx WordPress plugin.

ImpEx is an open source WordPress plugin for importing/exporting WordPress data. @cm4all-wp-impex/generator is part of the ImpEx WordPress plugin project.

Watch the tutorial on YouTube:

Watch the video

Details

The ImpEx WordPress plugin specifies a JSON-file-based import/export format for WordPress content.

@cm4all-wp-impex/generator provides

Last but not least, @cm4all-wp-impex/generator includes a full-featured example that transforms a complete static website into WordPress content consumable by the ImpEx WordPress plugin. The example is a good starting point for creating your own WordPress content generator.

Installation

npm install @cm4all-wp-impex/generator

Development

  • clone the ImpEx WordPress plugin Git repository : git clone https://github.com/IONOS-WordPress/cm4all-wp-impex.git

  • cd into the @cm4all-wp-impex/generator sub-project : cd packages/@cm4all-wp-impex/generator

  • ensure the correct Node.js version (see https://github.com/IONOS-WordPress/cm4all-wp-impex/blob/develop/.nvmrc) is installed : nvm install

  • install package dependencies : npm ci

  • run the tests : npm run test

tests

requirements

The tests require the diff command to be available.

Usage

@cm4all-wp-impex/generator exposes an API for generating WordPress content.

API

To use the API just import the exposed API into your code.

import { ImpexTransformer, traverseBlocks, ImpexSliceFactory, migrate } from "@cm4all-wp-impex/generator";

Transforming data into WordPress content

Data transformation into Gutenberg block annotated HTML is done by the ImpexTransformer singleton.

ImpexTransformer can be configured by calling its setup(...) function, which supports various hooks for customizing the transformation.

ImpexTransformer.transform(data) transforms the content provided in the data argument into Gutenberg block annotated HTML.

ImpexTransformer.setup({/* options */})

Options
  • verbose (boolean, default : false) enables verbose output for debugging purposes

  • onLoad(data : any) : string (function, default : undefined) callback executed by transform(...) function. data argument is the initial data to transform.

    This callback is intended to be used for converting the initial data into HTML.

    Example: If your initial data is markdown content this callback should transform it to HTML:

    ...
    ImpexTransformer.setup({
      onLoad(data) {
        return markdown.toHTML(data);
      }
    });
    ...
    

    If onLoad is not defined the transform function will assume the data argument is valid HTML.

  • onDomReady(document : Document) : void (function, default : undefined) callback executed when HTML is loaded and the DOM is ready.

    At this stage, you can use the HTML DOM manipulation API (querySelector for example) to rearrange the HTML DOM the way you need.

    The Transformer uses JSDOM to provide DOM capabilities to NodeJS. So you can use everything you know about DOM manipulation in NodeJS.

    See tests for example usage.

  • onRegisterCoreBlocks() : boolean (function, default : undefined) callback to register Gutenberg blocks.

    This callback is the workhorse transforming HTML to Gutenberg block annotated HTML.

    Most transformation work is delegated to the Gutenberg Block Transforms API. This API processes the given DOM and applies the Gutenberg Block transformations of all registered blocks. The result is valid Gutenberg block annotated HTML as we want it.

    Using the onRegisterCoreBlocks callback you can register your own Gutenberg blocks (including their transform rules) or attach additional transform rules to existing core Gutenberg blocks utilizing Gutenberg filter 'blocks.registerBlockType'.

    If your onRegisterCoreBlocks callback returns true, the transform rules of the core Gutenberg blocks will be reset to their defaults.

    If onRegisterCoreBlocks is not given, transform(...) will assume that the core Gutenberg blocks should be used as-is.

    See tests for example usage.

  • onSerialize(blocks : array) : array (function, default : undefined) callback executed after the Gutenberg block transform rules have been applied.

    The resulting array of Gutenberg blocks is passed to the callback. The callback can modify the blocks array and is expected to return them.

    Example moving the title attribute of every Gutenberg Image block into the caption block attribute. This will result in a <figcaption> element inside the block output:

    ImpexTransformer.setup({
      onSerialize(blocks) {
        // takeover img[@title] as figcaption in every block
        for (const block of traverseBlocks(blocks)) {
          if (block.name === "core/image") {
            block.attributes.caption = block.attributes.title;
            delete block.attributes.title;
          }
        }
    
        return blocks;
      },
    });
    

    traverseBlocks is a helper function exposed by this package to traverse the Gutenberg block hierarchy like a flat array.

    See tests for example usage.
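To illustrate what traversing the block hierarchy "like a flat array" means, here is a minimal sketch of a depth-first traversal. traverseBlocksSketch is a hypothetical stand-in written for this example, not the package's implementation; it assumes that nested blocks live in the innerBlocks array of each Gutenberg block object.

```javascript
// Hypothetical stand-in for illustration only; use the traverseBlocks
// function exposed by the package in real code.
function* traverseBlocksSketch(blocks) {
  for (const block of blocks) {
    yield block;
    // Assumption: nested blocks are stored in block.innerBlocks.
    if (Array.isArray(block.innerBlocks)) {
      yield* traverseBlocksSketch(block.innerBlocks);
    }
  }
}

// Sample block tree made up for this example.
const blocks = [
  {
    name: "core/group",
    attributes: {},
    innerBlocks: [
      { name: "core/image", attributes: { title: "A sunset" }, innerBlocks: [] },
    ],
  },
  { name: "core/paragraph", attributes: {}, innerBlocks: [] },
];

// Parents are visited before their children, siblings in array order.
const visitedNames = [...traverseBlocksSketch(blocks)].map((block) => block.name);
console.log(visitedNames); // core/group, core/image, core/paragraph
```

Because the nested image block is reached without any explicit recursion in the calling code, a single for...of loop (as in the onSerialize example) is enough to touch every block.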

ImpexTransformer.transform(data : any) : string

The transform function transforms the given data into Gutenberg block annotated HTML.

The data argument can be anything. All hooks configured using ImpexTransformer.setup(...) will take effect by executing this function.

The returned string is valid Gutenberg block annotated HTML.
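As an illustration of the onDomReady hook described in the options, the following sketch removes elements that usually should not end up in WordPress content before the block transforms run. The selector list is an assumption chosen for this example; only the standard DOM calls querySelectorAll and remove are used.

```javascript
// Sketch of an onDomReady hook. The selector below is an assumption for
// this example; adapt it to the markup of your source pages.
function onDomReady(document) {
  const unwanted = document.querySelectorAll("script, style, nav, footer");
  for (const element of unwanted) {
    // Detach the element from the DOM so it is not transformed into blocks.
    element.remove();
  }
}

// Intended usage with the API described above (not executed here):
//   ImpexTransformer.setup({ onDomReady });
//   const blockHtml = ImpexTransformer.transform(html);
```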

Encapsulate Gutenberg block annotated HTML in ImpEx slice JSON data structure

To import the generated Gutenberg block annotated HTML into WordPress we need to generate JSON files conforming to the ImpEx WordPress plugin format, wrapping the content with WordPress meta-data.

Class ImpexSliceFactory provides a simple way to generate WordPress ImpEx Slice JSON structures.

At first we need to create an instance of ImpexSliceFactory:

const sliceFactory = new ImpexSliceFactory({
  /* options */
});

There is just one (optional) option next_post_id : integer (default : 1) which can be used to provide an individual start post_id. next_post_id is only taken into account when creating content slices for WordPress content like posts/pages or media.

The ImpEx WordPress plugin supports some more slice types (for exporting whole database tables and more) but in these cases next_post_id is not in use.

Using the ImpexSliceFactory instance we've created we can now generate WordPress ImpEx Slice JSON structures for WordPress content or media by calling function createSlice(sliceType : string, callback(factory, sliceJson : any) : any).

The sliceType argument is the type of the slice to be created.

The callback function is called with the ImpexSliceFactory instance and the generated slice JSON structure as parameters.

Encapsulate WordPress content into ImpEx JSON

Creating the JSON for a WordPress post is dead simple :

const slice = sliceFactory.createSlice("content-exporter", (factory, slice) => {
  slice.data.posts[0].title = "Hello";
  slice.data.posts[0]["wp:post_content"] =
    "<!-- wp:paragraph --><p>my friend</p><!-- /wp:paragraph -->";
  return slice;
});

Creating a WordPress page with some additional WordPress meta-data works the same way:

const slice = sliceFactory.createSlice("content-exporter", (factory, slice) => {
  slice.data.posts[0].title = "Hello";
  slice.data.posts[0]["wp:post_type"] = "page";
  slice.data.posts[0]["wp:post_excerpt"] = "A page about my friend";
  slice.data.posts[0]["wp:post_content"] =
    "<!-- wp:paragraph --><p>Hello my friend</p><!-- /wp:paragraph -->";
  return slice;
});
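When many pages have to be generated, the pattern above can be wrapped in a small helper. The following is a sketch under assumptions: createPageSlice is a hypothetical function name, and it relies only on createSlice(...) and the slice fields shown in the examples above.

```javascript
// Hypothetical helper: builds one page slice from plain values.
// It uses only createSlice(...) and the fields shown in the examples above.
function createPageSlice(sliceFactory, { title, excerpt, content }) {
  return sliceFactory.createSlice("content-exporter", (factory, slice) => {
    const post = slice.data.posts[0];
    post.title = title;
    post["wp:post_type"] = "page";
    post["wp:post_excerpt"] = excerpt;
    post["wp:post_content"] = content;
    return slice;
  });
}

// Intended usage (not executed here):
//   const sliceFactory = new ImpexSliceFactory();
//   const slice = createPageSlice(sliceFactory, {
//     title: "Hello",
//     excerpt: "A page about my friend",
//     content: "<!-- wp:paragraph --><p>Hello</p><!-- /wp:paragraph -->",
//   });
```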

Encapsulate WordPress attachments like images into ImpEx JSON

Creating the JSON for a WordPress attachment is even simpler :

// declares an attachment for image './foo.jpg'
const slice = sliceFactory.createSlice("attachment", (factory, slice) => {
  slice.data = "./foo.jpg";

  return slice;
});

In most cases, our imported content (aka posts/pages) will reference the media in various ways like /image/foo.jpg or ../../images/foo.jpg and so on.

The ImpEx WordPress plugin will take care of replacing image references in Gutenberg block annotated HTML if we provide a replacement hint impex:post-references (see Attachments (like Pictures and Videos) for details).

const slice = sliceFactory.createSlice("attachment", (factory, slice) => {
  slice.data = "./foo.jpg";
  // will result in replacing all matching references in posts of the WordPress instance with the link to the imported image
  slice.meta["impex:post-references"] = [
    "/image/foo.jpg",
    "../../images/foo.jpg",
  ];

  return slice;
});
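For a whole set of images, the same call can be applied to a list. This is a minimal sketch: createAttachmentSlices and the shape of the images argument are hypothetical, while the "attachment" slice type, slice.data and the impex:post-references hint are taken from the examples above.

```javascript
// Hypothetical helper: creates one attachment slice per image description.
// Each entry is expected to look like { file: "./foo.jpg", references: [...] }.
function createAttachmentSlices(sliceFactory, images) {
  return images.map(({ file, references }) =>
    sliceFactory.createSlice("attachment", (factory, slice) => {
      slice.data = file;
      // References found in posts that should be replaced by the imported image.
      slice.meta["impex:post-references"] = references;
      return slice;
    })
  );
}
```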

Generate filenames for the JSON slice data in ImpEx Export format

The ImpEx WordPress plugin imports and exports data into a directory structure according to the ImpEx Export format.

@cm4all-wp-impex/generator supports creating the correct paths by providing a static generator function ImpexSliceFactory.PathGenerator().

This function returns a generator that yields a new relative path each time its next() function is called.

The optional parameter of ImpexSliceFactory.PathGenerator(max_slices_per_chunk : integer = 10) may be used to limit the number of slices per chunk directory to a custom value.

import { ImpexSliceFactory } from "@cm4all-wp-impex/generator";
...

const pathGenerator = ImpexSliceFactory.PathGenerator();
...

// 2 => only 2 slice files per chunk directory
const gen = ImpexSliceFactory.PathGenerator(2);

console.log(gen.next().value); // => "chunk-0001/slice-0001.json"
console.log(gen.next().value); // => "chunk-0001/slice-0002.json"
console.log(gen.next().value); // => "chunk-0002/slice-0001.json"
console.log(gen.next().value); // => "chunk-0002/slice-0002.json"
console.log(gen.next().value); // => "chunk-0003/slice-0001.json"
console.log(gen.next().value); // => "chunk-0003/slice-0002.json"
...

See tests and static website transformation example for real world usage.
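The naming scheme shown in the output above can be modelled with a few lines of plain JavaScript. The sketch below is a hypothetical re-creation for illustration, not the package's code; the four-digit zero padding is inferred from the documented output.

```javascript
// Hypothetical re-creation of the documented naming scheme; use the
// PathGenerator provided by the package in real code.
function* pathGeneratorSketch(maxSlicesPerChunk = 10) {
  // Assumption: counters are zero-padded to 4 digits, as in the output above.
  const pad = (n) => String(n).padStart(4, "0");
  for (let chunk = 1; ; chunk++) {
    for (let slice = 1; slice <= maxSlicesPerChunk; slice++) {
      yield `chunk-${pad(chunk)}/slice-${pad(slice)}.json`;
    }
  }
}

const paths = pathGeneratorSketch(2);
const firstThree = [paths.next().value, paths.next().value, paths.next().value];
console.log(firstThree);
// chunk-0001/slice-0001.json, chunk-0001/slice-0002.json, chunk-0002/slice-0001.json
```

The generator never finishes on its own, so callers pull exactly as many paths as they have slices to write.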

Migrate existing ImpEx export

The exported migrate function lets you transform existing ImpEx export data with minimal boilerplate code.

Example use case : Suppose you want to transform your WordPress pages created using a page builder like Elementor into true Gutenberg pages ...

  • Import the migrate function from the package using the following snippet:

    import { migrate } from "@cm4all-wp-impex/generator";
    
  • Synopsis : async migrate(sourcePath, targetPath, sliceCallback, options = {})

    The function traverses all ImpEx chunk sub directories and delegates the found slice files to the given callback argument.

    Slice files will be delegated to the callback ordered by sub chunk directory name and slice file name.

    Arguments:

    • sourcePath : string

      path to an ImpEx export directory containing the exported data to transform

    • targetPath : string

      Directory to write the transformed ImpEx export to. Will be created if it does not exist.

    • sliceCallback : function

      A callback function called for every ImpEx slice file.

      The provided callback can handle the slice file itself (transform its content into one or more target slice files, or simply suppress it) and return a truthy value.

      Otherwise the default mechanism of the migrate function is applied and the slice is copied to the target directory, including subsidiary file(s) in case of an attachment slice.

      An async callback is also supported.

      Arguments:

      • slicePath : string

        The absolute path to the current slice file. It's up to the callback to load/parse/process the slice JSON file.

      • pathGenerator : ImpexSliceFactory.PathGenerator

        A preconfigured instance of ImpexSliceFactory.PathGenerator provided by this package. Using the pathGenerator allows you to generate valid ImpEx target slice file paths.

      • targetPath : string

        The path of the resulting ImpEx export directory.

      • options : object

        The options object provided to the migrate function.

      Return:

      • truthy migrate assumes the slice was already consumed/processed by the callback

      • falsy the migrate algorithm will copy the slice (and associated files in case of an attachment slice) to the target directory

    • options : object

      An optional argument to customize the migrate function behaviour.

      Keys :

      • onStart : function

        Will be called right before migrate will call the sliceCallback the first time.

        If the ImpEx export is empty (=> no slice files in the export directory) the onStart callback will never be called.

        An async function callback is supported.

      • onFinish : function

        Will be called after migrate has called the sliceCallback the last time.

        If the ImpEx export is empty (=> no slice files in the export directory) the onFinish callback will never be called.

        An async function callback is supported.

Example migrate usage

  • strip out any content of an ImpEx export except attachment posts (Images/Videos) using migrate.
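import { readFile } from "node:fs/promises";

await migrate(
  "./my-impex-export",
  "./my-migrated-impex-export",
  async (slicePath) => {
    // load and parse the slice JSON file
    const slice = JSON.parse(await readFile(slicePath, "utf8"));

    // return truthy for all slices except attachments:
    // truthy => the slice counts as handled and is not copied,
    // falsy  => migrate copies the attachment slice and its file(s)
    return slice.tag !== "attachment";
  }
);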
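A callback can also rewrite slices instead of only filtering them. The following sketch shows one possible shape under several assumptions: createContentRewriter and rewriteContent are hypothetical names, content slices are assumed to carry the tag "content-exporter", and the paths yielded by pathGenerator are assumed to be relative to targetPath. The fs argument is expected to be Node's node:fs module; passing it in keeps the sketch easy to try out without touching the disk.

```javascript
// Sketch of a slice callback factory for migrate(...).
// fs: Node's "node:fs" module (or a compatible object).
// rewriteContent: hypothetical function mapping block HTML to block HTML.
function createContentRewriter(fs, rewriteContent) {
  return (slicePath, pathGenerator, targetPath) => {
    const slice = JSON.parse(fs.readFileSync(slicePath, "utf8"));

    // Assumption: content slices carry the tag "content-exporter".
    if (slice.tag !== "content-exporter") {
      return false; // falsy: migrate copies the slice unchanged
    }

    for (const post of slice.data.posts) {
      post["wp:post_content"] = rewriteContent(post["wp:post_content"]);
    }

    // Assumption: generated paths are relative to the target directory.
    const outputPath = `${targetPath}/${pathGenerator.next().value}`;
    const outputDir = outputPath.slice(0, outputPath.lastIndexOf("/"));
    fs.mkdirSync(outputDir, { recursive: true });
    fs.writeFileSync(outputPath, JSON.stringify(slice, null, 2));

    return true; // truthy: the slice was handled by the callback
  };
}

// Intended usage (not executed here):
//   import fs from "node:fs";
//   await migrate(source, target, createContentRewriter(fs, (html) => html));
```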

Check out the package test cases for further usage examples.