office-ooxml-to-html

Native document.convert

Converts a single base64-encoded .docx or .pptx (OOXML) file to semantic HTML for downstream kb-article-normalize / ingest. The conversion uses no network and no filesystem access. Layout and embedded objects are handled on a best-effort basis; complex charts may be omitted, in which case a warning is reported.

Phenotype

Input

Property    Type    Req  Description
fileName    string  no   Original filename for hint (e.g. report.docx); optional if OOXML kind is unambiguous.
fileBase64  string  yes  Base64-encoded file bytes (standard or data-URL prefix stripped by host).
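
The fileName description says the hint is optional when the OOXML kind is unambiguous. The sketch below shows one way such a check could work; it is not the gene's documented detection logic. It assumes the package keeps its main part at the conventional location (word/document.xml for .docx, ppt/presentation.xml for .pptx), and the function name detect_kind is hypothetical.

```python
import base64
import io
import zipfile
from typing import Optional


def detect_kind(file_base64: str, file_name: Optional[str] = None) -> Optional[str]:
    """Guess 'docx' or 'pptx' from package contents, then from the file name.

    Hypothetical helper: the part names checked here are the conventional
    main-part locations, which is an assumption about the input files.
    """
    data = base64.b64decode(file_base64)
    try:
        # OOXML files are ZIP packages, so list the entries in memory.
        with zipfile.ZipFile(io.BytesIO(data)) as package:
            names = set(package.namelist())
    except zipfile.BadZipFile:
        return None

    has_word = "word/document.xml" in names
    has_ppt = "ppt/presentation.xml" in names
    if has_word != has_ppt:
        # Exactly one main part is present: the kind is unambiguous.
        return "docx" if has_word else "pptx"

    # Neither or both parts present: fall back to the file name hint.
    if file_name:
        lowered = file_name.lower()
        for kind in ("docx", "pptx"):
            if lowered.endswith("." + kind):
                return kind
    return None
```

Everything happens in memory, which matches the stated constraint of no network and no filesystem access.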

Output

Property      Type           Req  Description
html          string         yes  Full HTML document (UTF-8) with article root suitable for M1 normalize.
title         string         no   Optional title from document properties or first heading.
format        full_document  yes  Always full_document for this version.
warnings      array          yes
detectedKind  docx | pptx    yes

Raw JSON Schema

inputSchema

{
  "type": "object",
  "required": [
    "fileBase64"
  ],
  "properties": {
    "fileName": {
      "type": "string",
      "description": "Original filename for hint (e.g. report.docx); optional if OOXML kind is unambiguous."
    },
    "fileBase64": {
      "type": "string",
      "description": "Base64-encoded file bytes (standard or data-URL prefix stripped by host)."
    }
  }
}
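
As an illustration of the input shape above, here is a minimal Python sketch that builds an object matching inputSchema from raw file bytes. The helper names build_input and strip_data_url_prefix are hypothetical and not part of the gene. The schema says the host strips any data-URL prefix, so the second helper only shows what that stripping might look like.

```python
import base64
import re
from typing import Optional

# Matches a data-URL header such as "data:application/zip;base64,".
_DATA_URL_PREFIX = re.compile(r"^data:[^,]*;base64,", re.IGNORECASE)


def strip_data_url_prefix(value: str) -> str:
    """Drop a leading data-URL header if present (hypothetical helper)."""
    return _DATA_URL_PREFIX.sub("", value, count=1)


def build_input(file_bytes: bytes, file_name: Optional[str] = None) -> dict:
    """Build an object shaped like inputSchema (hypothetical helper)."""
    payload = {"fileBase64": base64.b64encode(file_bytes).decode("ascii")}
    if file_name is not None:
        # fileName is optional; it only serves as a hint for the kind.
        payload["fileName"] = file_name
    return payload
```

A caller would typically pass the bytes of an uploaded .docx or .pptx and forward the resulting object as the gene input; how the gene is invoked is not described here.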

outputSchema

{
  "type": "object",
  "required": [
    "html",
    "format",
    "detectedKind",
    "warnings"
  ],
  "properties": {
    "html": {
      "type": "string",
      "description": "Full HTML document (UTF-8) with article root suitable for M1 normalize."
    },
    "title": {
      "type": "string",
      "description": "Optional title from document properties or first heading."
    },
    "format": {
      "enum": [
        "full_document"
      ],
      "type": "string",
      "description": "Always full_document for this version."
    },
    "warnings": {
      "type": "array",
      "items": {
        "type": "string"
      }
    },
    "detectedKind": {
      "enum": [
        "docx",
        "pptx"
      ],
      "type": "string"
    }
  }
}
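
A consumer may want to check a result against outputSchema before handing the HTML to the next step. The sketch below hand-codes the checks implied by the schema using only the standard library; the function name check_output and the wording of the messages are assumptions, and a full JSON Schema validator could be used instead.

```python
from typing import Any, List

REQUIRED_KEYS = ("html", "format", "detectedKind", "warnings")
ALLOWED_KINDS = ("docx", "pptx")


def check_output(result: Any) -> List[str]:
    """Return a list of schema problems; an empty list means the result conforms."""
    if not isinstance(result, dict):
        return ["result is not an object"]

    problems = []
    for key in REQUIRED_KEYS:
        if key not in result:
            problems.append(f"missing required property: {key}")

    if "html" in result and not isinstance(result["html"], str):
        problems.append("html must be a string")
    if "title" in result and not isinstance(result["title"], str):
        problems.append("title must be a string")
    if "format" in result and result["format"] != "full_document":
        problems.append("format must be full_document")
    if "detectedKind" in result and result["detectedKind"] not in ALLOWED_KINDS:
        problems.append("detectedKind must be docx or pptx")
    if "warnings" in result:
        warnings = result["warnings"]
        # warnings is an array whose items are all strings.
        if not (isinstance(warnings, list) and all(isinstance(w, str) for w in warnings)):
            problems.append("warnings must be an array of strings")
    return problems
```

Because warnings is required, a conforming result always carries the array, even when it is empty, so callers can log or surface omitted charts without checking for its presence.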