Word Documents
Lab 2.1: Word Documents
Extract and convert DOCX to clean markdown
Start the Lab
Process Word documents
/CC.m2.lb1
What You’ll Learn
- Identify the key structural elements that Claude Code preserves when extracting content from Word documents (headings, lists, tables) versus elements that are lost (fonts, colours, complex formatting)
- Apply the docx skill to convert messy Word documents into clean, structured markdown whilst preserving semantic meaning and logical hierarchy
- Create properly tagged markdown files with YAML frontmatter that integrate processed documents into the knowledge vault for future searchability
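
To make the first two objectives concrete, here is a minimal sketch of the general idea: a DOCX file is a ZIP archive whose `word/document.xml` holds the paragraphs, and structure can be read from paragraph styles while run-level formatting is ignored. This is not the docx skill's actual implementation. The function names are hypothetical, and it assumes headings use Word's default English style IDs (`Heading1`, `Heading2`, ...). Table cell paragraphs are flattened into plain lines here rather than rebuilt as markdown tables.

```python
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML namespace used by elements in word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def document_xml_to_markdown(xml_data):
    """Convert the contents of word/document.xml into simple markdown (hypothetical helper)."""
    root = ET.fromstring(xml_data)
    lines = []
    for para in root.iter(W + "p"):
        # Join all text runs; run properties such as fonts and colours are ignored.
        text = "".join(t.text or "" for t in para.iter(W + "t")).strip()
        if not text:
            continue
        style = para.find(f"{W}pPr/{W}pStyle")
        style_id = style.get(W + "val", "") if style is not None else ""
        if style_id.startswith("Heading") and style_id[7:].isdigit():
            # Assumption: default style IDs like "Heading2" map to "##".
            lines.append("#" * int(style_id[7:]) + " " + text)
        elif para.find(f"{W}pPr/{W}numPr") is not None:
            # Paragraphs with numbering properties are list items.
            lines.append("- " + text)
        else:
            lines.append(text)
    return "\n".join(lines)


def docx_to_markdown(path):
    """Read word/document.xml out of a .docx archive and convert it."""
    with zipfile.ZipFile(path) as archive:
        return document_xml_to_markdown(archive.read("word/document.xml"))
```

Because the conversion only looks at style IDs and numbering properties, two documents that look very different on screen can produce the same markdown, which is the sense in which semantic structure survives and visual formatting does not.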
Key Concepts
| Concept | Description |
|---|---|
| Content Extraction | The process of pulling meaningful information from documents whilst discarding visual formatting noise |
| XML Structure | The underlying document structure that Claude Code parses, containing headings, paragraphs, lists, and tables |
| YAML Frontmatter | Metadata block at the start of markdown files containing title, tags, and other searchable properties |
| Semantic Meaning | The actual meaning and logical relationships within content, independent of how it appears visually |
| Metadata Tags | Descriptive labels added to processed documents that enable searching and filtering in the knowledge vault |
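
As an illustration of the YAML Frontmatter and Metadata Tags concepts, the sketch below prepends a metadata block to converted markdown. It is only one possible layout: the function name, the `source` field, and the example tag values are hypothetical, and the lab may use different fields. Strings are quoted with `json.dumps`, since JSON string literals are also valid YAML double-quoted scalars.

```python
import json


def with_frontmatter(body, title, tags, **extra):
    """Return markdown text with a YAML frontmatter block in front (hypothetical helper)."""
    fields = {"title": title, "tags": list(tags), **extra}
    lines = ["---"]
    for key, value in fields.items():
        if isinstance(value, list):
            # Lists are written as YAML block sequences, one item per line.
            lines.append(f"{key}:")
            lines.extend(f"  - {json.dumps(item)}" for item in value)
        else:
            lines.append(f"{key}: {json.dumps(value)}")
    lines.append("---")
    return "\n".join(lines) + "\n\n" + body.lstrip("\n")


# Example call with made-up values:
note = with_frontmatter(
    "# Customer Notes\n",
    title="Customer Notes",
    tags=["customer", "meeting-notes"],
    source="customer-notes.docx",
)
```

Keeping the tags as a proper YAML list, rather than a comma-separated string, lets vault tools filter on individual tags.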
What You’ll Create
- customer-notes.md: Clean markdown with metadata
Eureka Moment
“It extracted all that mess into clean sections in under a minute - I’ve spent entire afternoons doing this manually!”