Static website conversion tutorial
This chapter walks through the conversion of a static HTML website to a WordPress site using the @cm4all-wp-impex/generator package
and the ImpEx WordPress plugin.
The sources can be found at the ImpEx WordPress plugin GitHub repository.
- About
- Conversion process
- What's missing?
- Local Development using cm4all-wp-impex
- Full conversion script
About
This is a full-featured example of converting the regular static website of a fictional German dentist to a WordPress site.
The website is available offline in the directory ./homepage-dr-mustermann.
You can view the website by
- starting the PHP built-in web server:
php -S localhost:8080 -t homepage-dr-mustermann/
- and opening the website in your browser:
http://localhost:8080/
Watch the walk-through on YouTube
(German audio with English subtitles.)
Conversion process
The conversion process is implemented in a single file, ./index.js:
- scanning the filesystem for HTML and media files using plain NodeJS
- converting the HTML files to ImpEx slice JSON using ImpexTransformer and ImpexSliceFactory from package @cm4all-wp-impex/generator. The HTML transformation is customized in the setup(...) function.
- creating ImpEx slice JSON for the media files using ImpexSliceFactory from package @cm4all-wp-impex/generator
- saving the ImpEx slice JSON to the filesystem using paths generated by ImpexSliceFactory.PathGenerator from package @cm4all-wp-impex/generator
  - the media files are saved to the filesystem using paths adapted from the ImpexSliceFactory.PathGenerator generated paths for the slice files (as expected by the ImpEx Export format).
The conversion process is implemented in less than 240 lines of code thanks to package @cm4all-wp-impex/generator.
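The path adaptation for media files mentioned in the last step can be sketched as a small pure function. This is only an illustration and not part of the package API: the helper name attachmentPathFor and the sample paths are hypothetical, and real slice paths come from ImpexSliceFactory.PathGenerator.

```javascript
// Hypothetical helper (not part of @cm4all-wp-impex/generator):
// derive the target path of a media file from the path of its slice JSON file.
// It mirrors the naming used by the conversion script:
// "<slice>.json" becomes "<slice>-<original file name>".
function attachmentPathFor(slicePath, mediaFile) {
  // take the last path segment of the media file (handles "/" and "\")
  const fileName = mediaFile.split(/[\\/]/).pop();
  // swap the trailing ".json" for "-<fileName>";
  // a replacer function avoids special "$" patterns in file names
  return slicePath.replace(/\.json$/, () => `-${fileName}`);
}

// illustrative values only, the slice path below is made up
const examplePath = attachmentPathFor(
  "chunk-0001/slice-0002.json",
  "/site/images/logo.png"
);
console.log(examplePath);
```

Keeping the media file next to its slice JSON file, with the slice name as prefix, is what lets an importer match both files without any extra lookup table.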
You can run the conversion script by executing ./index.js
(it can be found in the GitHub repository at packages/@cm4all-wp-impex/generator/examples/impex-complete-static-homepage-conversion/index.js).
Before running it, ensure the right NodeJS version is active by using
nvm install
and install the required NodeJS dependencies by entering the directory
cm4all-wp-impex/packages/@cm4all-wp-impex/generator
and executing npm ci.
The result is a folder generated-impex-export/
containing the ImpEx export folder layout with the generated ImpEx slice JSON files and media files.
This export can now be imported into WordPress using the ImpEx CLI:
impex-cli.php import -username=<adminusername> -password=<adminpassword> -rest-url=<your-wordpress-rest-api-endpoint> ./generated-impex-export/
(Replace the <placeholder> values with your own.)
Ensure your WordPress instance is empty (does not contain any pages/posts/media) before running the import.
After executing the command, the website contents are imported into your WordPress instance.
The example website and conversion script are intentionally simple.
Since every website is different, no conversion process can work universally for every website.
By implementing additional transformation rules using the hooks supported by the ImpexTransformer.setup(...)
function of @cm4all-wp-impex/generator,
almost any detail of a website can be converted to a WordPress post/page.
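As an illustration of such an additional rule, the sketch below rewrites links between the static pages so that they point to WordPress-style permalinks. Treat it as a hedged example: the helper names and the permalink scheme (about.html becomes /about/) are assumptions and nothing ImpEx prescribes. Only the onDomReady(document, options) hook shape is taken from the conversion script.

```javascript
// Hypothetical mapping from a static page href to a permalink.
// Assumption: "index.html" becomes "/" and "about.html" becomes "/about/".
function hrefToPermalink(href) {
  // leave absolute URLs, in-page anchors and non-HTML targets untouched
  const isExternalOrAnchor = /^([a-z][a-z0-9+.-]*:|#|\/\/)/i.test(href);
  const isHtmlPage = /\.html?($|[#?])/i.test(href);
  if (isExternalOrAnchor || !isHtmlPage) {
    return href;
  }
  // split into the page path and an optional "#..." or "?..." suffix
  const [, path, suffix = ""] = href.match(/^(.*?)\.html?([#?].*)?$/i);
  const slug = path.replace(/^\.?\//, "");
  return (slug === "index" ? "/" : `/${slug}/`) + suffix;
}

// Rule in the shape of the onDomReady hook used by the conversion script:
// rewrite every link of the parsed page in place.
function rewriteInternalLinks(document) {
  for (const anchor of document.querySelectorAll("a[href]")) {
    anchor.setAttribute("href", hrefToPermalink(anchor.getAttribute("href")));
  }
}
```

A rule like this could be called from the onDomReady hook passed to ImpexTransformer.setup(...), next to the rules that are already part of the script. Whether the resulting permalinks match depends on the permalink settings of the target WordPress instance.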
What's missing?
The example does not cover every detail of a website conversion, only the content. That's intentional.
Possible improvements:
- The navigation bar could be converted to a custom WordPress nav_menu.
  Navigation is handled differently in FSE and classic themes. In an FSE theme you would generate a Navigation block; in a classic theme it works differently. How to take over the navigation depends on the target WordPress environment.
- Styles are ignored in the example.
  Whether they are needed depends on the goal of the transformation. If the content should be styled completely by a WordPress theme, converting the styles is not needed.
  If needed, style properties like fonts and colors could be introspected and transformed to FSE theme.json settings.
- The contact form is taken over as a core/html block. Submitting the form does not work in the example.
  WordPress/Gutenberg does not provide a generic Form block, so there is no way to convert the HTML form to a matching equivalent using plain WordPress/Gutenberg.
  The form could easily be converted into a form of Ninja Forms or any other form builder plugin available for WordPress.
  To keep the example simple and working without depending on additional plugins like Ninja Forms, the form is just converted to a core/html block.
  So it depends on your target WordPress environment (and available plugins) how the conversion will be implemented.
- The overall layout (header/footer/main section) is also ignored (but could be converted to FSE template parts).
But as you might guess, all these improvements may vary depending on the goal.
The important message is: everything is possible, but because it's individual, it's up to you 💪
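To make the style improvement a bit more concrete, the following sketch turns introspected colors and fonts into a theme.json fragment. It assumes the values were already extracted from the CSS of the static site; the function name, the input shape and the sample values are hypothetical, and the example does not cover the extraction itself.

```javascript
// Hypothetical helper: build a theme.json fragment from extracted styles.
// Assumed input shape: { colors: { name: cssColor }, fonts: { name: cssFontFamily } }
function toThemeJson({ colors = {}, fonts = {} }) {
  // derive a slug such as "brand-blue" from a display name such as "Brand Blue"
  const slugify = (name) =>
    name
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-")
      .replace(/^-|-$/g, "");
  return {
    version: 2,
    settings: {
      color: {
        palette: Object.entries(colors).map(([name, color]) => ({
          slug: slugify(name),
          name,
          color,
        })),
      },
      typography: {
        fontFamilies: Object.entries(fonts).map(([name, fontFamily]) => ({
          slug: slugify(name),
          name,
          fontFamily,
        })),
      },
    },
  };
}

// illustrative values only
const themeJson = toThemeJson({
  colors: { "Brand Blue": "#1f8dd6" },
  fonts: { Body: "Helvetica, Arial, sans-serif" },
});
console.log(JSON.stringify(themeJson, null, 2));
```

The generated object would still have to be written into the theme.json of the target theme, which is outside the scope of an ImpEx export.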
Local Development using cm4all-wp-impex
- (optional) clean up the local wp-env installation:
(cd $(git rev-parse --show-toplevel) && make wp-env-clean)
- import using the ImpEx CLI:
$(git rev-parse --show-toplevel)/impex-cli/impex-cli.php import -username=admin -password=password -rest-url=http://localhost:8888/wp-json -profile=all ./generated-impex-export/
Full conversion script
#!/usr/bin/env node
/*
 * @cm4all-wp-impex/generator usage example converting a whole static homepage to an impex export
 */
import { resolve, join, extname, dirname, basename } from "path";
import { readdir, readFile, mkdir, rm, writeFile, copyFile } from "fs/promises";
import { ImpexTransformer, ImpexSliceFactory } from "../../src/index.js";
/**
 * STATIC_HOMEPAGE_DIRECTORY is the directory containing the static homepage
 */
const STATIC_HOMEPAGE_DIRECTORY = new URL(
  "homepage-dr-mustermann",
  import.meta.url
).pathname;
/**
 * generator function yielding matched files, optionally descending into subdirectories
 *
 * @param {string} dir directory to search
 * @param {boolean} recursive whether to search recursively
 * @param {string|undefined} extension file extension to match or undefined to match all files
 *
 * @yields {string} path to file
 */
async function* getFiles(dir, recursive, extension) {
  const entries = await readdir(dir, { withFileTypes: true });
  for (const entry of entries) {
    const res = resolve(dir, entry.name);
    if (entry.isDirectory()) {
      // descend into subdirectories only when requested
      if (recursive) {
        yield* getFiles(res, recursive, extension);
      }
    } else if (!extension || entry.name.endsWith(extension)) {
      yield res;
    }
  }
}
/**
 * keeps track of images and their references from html files (aka pages)
 * key is the absolute image path (resolved against STATIC_HOMEPAGE_DIRECTORY)
 * value is array of image references
 */
const img2imgSrc_mappings = {};
/**
 * set up the ImpexTransformer singleton
 *
 * @return {ImpexSliceFactory}
 */
function setup() {
  ImpexTransformer.setup({
    onDomReady(document, options = { path: null }) {
      // replace <header> elements with the <ul> menu list
      for (const header of document.querySelectorAll("header")) {
        const ul = document.querySelector("ul.pure-menu-list");
        // skip the replacement if the page has no menu list
        if (ul) {
          header.replaceWith(ul.cloneNode(true));
        }
      }
      // replace <section> elements with their inner contents
      for (const section of document.querySelectorAll("section")) {
        for (const child of section.childNodes) {
          section.parentNode.insertBefore(child.cloneNode(true), section);
        }
        section.remove();
      }
      // replace <footer> elements with <p>
      for (const footer of document.querySelectorAll("footer")) {
        const paragraph = document.createElement("p");
        //paragraph.setAttribute("class", "footer");
        paragraph.innerHTML = footer.innerHTML;
        footer.replaceWith(paragraph);
      }
      if (options?.path) {
        // grab all image references and remember them for later processing
        for (const img of document.querySelectorAll("img")) {
          const src = img.getAttribute("src");
          // compute the absolute image path; attachments are looked up
          // by their absolute path when the attachment slices are created
          const imgPath = resolve(join(STATIC_HOMEPAGE_DIRECTORY, src));
          // add reference to image path
          (
            img2imgSrc_mappings[imgPath] || (img2imgSrc_mappings[imgPath] = [])
          ).push(src);
        }
      }
    },
  });
  return new ImpexSliceFactory();
}
async function main() {
  // setup ImpexTransformer singleton and get a ImpexSliceFactory instance
  const impexSliceFactory = setup();
  // group files by type (html or attachment)
  const attachmentResources = [];
  const htmlResources = [];
  // iterate over all files recursively in STATIC_HOMEPAGE_DIRECTORY
  for await (const res of getFiles(STATIC_HOMEPAGE_DIRECTORY, true)) {
    const resource = res.toString();
    switch (extname(res)) {
      // stick HTML files into htmlResources
      case ".html":
        htmlResources.push({ resource });
        console.log("HTML %s", resource);
        break;
      // stick media files into attachmentResources
      case ".jpeg":
      case ".jpg":
      case ".gif":
      case ".png":
        attachmentResources.push({ resource });
        console.log("ATTACHMENT %s", resource);
        break;
    }
  }
  // get a generator function yielding ImpEx export format conformant paths
  const slicePathGenerator = ImpexSliceFactory.PathGenerator();
  // compute target directory
  const IMPEX_EXPORT_DIR = new URL("generated-impex-export", import.meta.url)
    .pathname;
  // delete already existing directory if it exists
  try {
    await rm(IMPEX_EXPORT_DIR, { recursive: true });
  } catch {}
  // create target directory
  await mkdir(IMPEX_EXPORT_DIR, { recursive: true });
  // convert html files to gutenberg annotated block content
  for (const htmlResource of htmlResources) {
    // transform html body to gutenberg annotated block content
    htmlResource.content = ImpexTransformer.transform(
      await readFile(htmlResource.resource, "utf8"),
      { path: htmlResource.resource }
    );
    // remember html metadata for later processing
    // (`document` is not declared in this script; it is assumed to be the
    // global DOM document that ImpexTransformer sets up via JSDOM)
    htmlResource.title =
      document.querySelector("head > title")?.textContent ?? "";
    htmlResource.description =
      document
        .querySelector('head > meta[name="description"]')
        ?.getAttribute("content") ?? "";
    // optional chaining keeps pages without a keywords meta tag from throwing
    htmlResource.keywords = (
      document
        .querySelector('head > meta[name="keywords"]')
        ?.getAttribute("content") ?? ""
    )
      .toLowerCase()
      .split(" ");
    // create ImpEx slice json content for this html file
    const slice = impexSliceFactory.createSlice(
      "content-exporter",
      (factory, slice) => {
        slice.data.posts[0]["wp:post_type"] = "page";
        slice.data.posts[0].title = htmlResource.title;
        slice.data.posts[0]["wp:post_excerpt"] = htmlResource.title;
        slice.data.posts[0]["wp:post_content"] = htmlResource.content;
        // @TODO: categories (aka keywords)
        // @TODO: add navigation
        return slice;
      }
    );
    // compute ImpEx conform slice json file path
    const slicePath = join(IMPEX_EXPORT_DIR, slicePathGenerator.next().value);
    await mkdir(dirname(slicePath), {
      recursive: true,
    });
    // write json to file
    await writeFile(slicePath, JSON.stringify(slice, null, 2));
  }
  // make media files available as ImpEx slices
  for (const attachmentResource of attachmentResources) {
    // create ImpEx slice json content for this media file
    const slice = impexSliceFactory.createSlice(
      "attachment",
      (factory, slice) => {
        // apply the path relative to the static homepage directory as content
        slice.data = attachmentResource.resource.substring(
          STATIC_HOMEPAGE_DIRECTORY.length + 1
        );
        // compute unique image file=>[img[@src]] mapping for this attachment
        let img2imgSrc_mapping = [
          ...new Set(img2imgSrc_mappings[attachmentResource.resource] ?? []),
        ];
        // add mapping to slice metadata
        slice.meta["impex:post-references"] = img2imgSrc_mapping;
        return slice;
      }
    );
    // compute ImpEx conform slice json file path
    const slicePath = join(IMPEX_EXPORT_DIR, slicePathGenerator.next().value);
    await mkdir(dirname(slicePath), {
      recursive: true,
    });
    // write slice json to file
    await writeFile(slicePath, JSON.stringify(slice, null, 2));
    // copy attachment file to target directory with ImpEx conform file name
    await copyFile(
      attachmentResource.resource,
      slicePath.replace(".json", "-" + basename(attachmentResource.resource))
    );
  }
  // JSDOM is preventing automatic process termination so we need to force it
  process.exit(0);
}
main();
The script is also available in the ImpEx WordPress plugin GitHub repository.
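The script leaves the conversion of keywords to categories as a @TODO. A possible first step is normalizing the content of the keywords meta tag into a clean list of names, as sketched below. The helper name parseKeywords is hypothetical, and the sketch assumes comma-separated keywords, whereas the script above splits on spaces. How the resulting names would be attached to a slice as categories is not shown in this tutorial and is left out here.

```javascript
// Hypothetical helper: normalize the content of <meta name="keywords">
// into a unique list of lowercase names.
// Assumption: keywords are separated by commas.
function parseKeywords(content) {
  const keywords = (content ?? "")
    .split(",")
    .map((keyword) => keyword.trim().toLowerCase())
    // drop empty entries caused by trailing or doubled commas
    .filter((keyword) => keyword.length > 0);
  // a Set removes duplicates while keeping the first occurrence order
  return [...new Set(keywords)];
}

// illustrative input only
const categories = parseKeywords("Zahnarzt, Implantate, zahnarzt,");
console.log(categories);
```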