mdslicer.mdslicer

Parse a markdown file into a header and sections. The header is a dictionary holding the metadata of the markdown file. The sections are a list of dictionaries, each with the title, id and HTML content of one section. For example:

sections =
[{'title': 'Section 1', 'id': 'section-1', 'content': '\n<p>Content 1</p>\n'},
 {'title': 'Section 2', 'id': 'section-2', 'content': '\n<p>Content 2</p>'}]
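
Because each section carries its own id and title, the list is easy to post-process. As a minimal sketch of one possible use, the hypothetical helper below (build_toc is not part of mdslicer) turns such a list into an HTML table of contents. It assumes the titles are plain text and that an empty id marks untitled lead-in content.

```python
from html import escape


def build_toc(sections: list[dict[str, str]]) -> str:
    """Build an HTML list of links from a list of section dictionaries."""
    items = [
        f'<li><a href="#{escape(s["id"])}">{escape(s["title"])}</a></li>'
        for s in sections
        if s["id"]  # skip sections without an id (assumed to be untitled content)
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


sections = [
    {"title": "Section 1", "id": "section-1", "content": "\n<p>Content 1</p>\n"},
    {"title": "Section 2", "id": "section-2", "content": "\n<p>Content 2</p>"},
]
print(build_toc(sections))
```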

Functions

split_header_and_content(file_content)

Split the content of a markdown file into a YAML header and the remaining content

Classes

MDSlicer([additional_parser])

Parse markdown content into a metadata header and sections

class MDSlicer(additional_parser=None, **kwargs)

Parse markdown content into a metadata header and sections

__init__(additional_parser=None, **kwargs)

Create a markdown parser with the given extensions.

Parameters:
  • additional_parser (Optional[Callable]) – Additional parser to apply to the markdown content

  • kwargs – Keyword arguments to pass to the markdown.Markdown() parser initializer (such as the list of extensions)
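
The parameter list above does not spell out the signature of additional_parser. As a minimal sketch, assuming the callable receives the markdown text as a str and returns the transformed str, a custom parser could look like the following. The names expand_variables, VARIABLES and PLACEHOLDER are hypothetical and exist only for this illustration.

```python
import re

# Hypothetical substitution table, used only for this illustration.
VARIABLES = {"version": "1.2.0"}

# Matches placeholders such as {{ version }} or {{version}}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def expand_variables(md_content: str) -> str:
    """Replace known {{ name }} placeholders; leave unknown ones untouched.

    Assumption: the additional parser takes and returns a markdown string.
    """
    return PLACEHOLDER.sub(
        lambda match: VARIABLES.get(match.group(1), match.group(0)),
        md_content,
    )


print(expand_variables("## Release {{ version }}"))
```

Under that assumption, the function would be passed as MDSlicer(additional_parser=expand_variables, extensions=["tables"]), where extensions is forwarded to markdown.Markdown() as described for kwargs.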

get_sections(html)

Get sections from the HTML content by splitting it at its h2 tags

Parameters:

html (str) – HTML content

Returns:

List of sections, each with an id, a title and its HTML content

Return type:

list[dict[str, str]]

Example

>>> from mdslicer import MDSlicer
>>> slicer = MDSlicer()
>>> html = "<h2>Section 1</h2><p>Content 1</p><h2>Section 2</h2><p>Content 2</p>"
>>> slicer.get_sections(html)
[{'title': 'Section 1', 'id': 'section-1', 'content': '<p>Content 1</p>'},
 {'title': 'Section 2', 'id': 'section-2', 'content': '<p>Content 2</p>'}]
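
To make the splitting idea concrete, here is a minimal standard-library sketch of cutting HTML at its h2 tags. It is not the library's implementation: split_on_h2 and slugify are hypothetical names, the regular expression only handles simple headings, and the id rule (lowercase words joined by hyphens) is an assumption inferred from the example output above.

```python
import re

# Captures the inner text of each <h2> heading.
H2_PATTERN = re.compile(r"<h2[^>]*>(.*?)</h2>", re.IGNORECASE | re.DOTALL)


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def split_on_h2(html: str) -> list[dict[str, str]]:
    """Split HTML into sections, one per <h2> heading."""
    # With one capture group, re.split returns
    # [before_first_h2, title_1, content_1, title_2, content_2, ...].
    parts = H2_PATTERN.split(html)
    sections = []
    if parts[0].strip():
        # Content before the first <h2> becomes an untitled section.
        sections.append({"title": "", "id": "", "content": parts[0]})
    for title, content in zip(parts[1::2], parts[2::2]):
        sections.append({"title": title, "id": slugify(title), "content": content})
    return sections


html = "<h2>Section 1</h2><p>Content 1</p><h2>Section 2</h2><p>Content 2</p>"
print(split_on_h2(html))
```

A regular expression keeps the sketch short, but it would not cope with headings that contain nested markup; an HTML parser is the safer choice for real input.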

slice_content(file_content)

Parse a markdown string into a YAML header and a list of content sections

Parameters:

file_content (str) – content of the markdown file

Returns:
  • header of the markdown file,

  • content sections of the markdown file

Return type:

tuple[dict, list[dict[str, str]]]

Examples

>>> from mdslicer import MDSlicer
>>> slicer = MDSlicer()
>>> file_content = '''
... ---
... title: Example
... ---
...
... ## Section 1
...
... Content 1
...
... ## Section 2
...
... Content 2'''
>>> header, sections = slicer.slice_content(file_content)
>>> print(header)
{'title': 'Example'}
>>> sections
[{'title': 'Section 1', 'id': 'section-1', 'content': '\n<p>Content 1</p>\n'},
 {'title': 'Section 2', 'id': 'section-2', 'content': '\n<p>Content 2</p>'}]

slice_file(mdfile_path)

Parse a markdown file into a YAML header and a list of content sections

Parameters:

mdfile_path (str | Path) – Path to the markdown file

Returns:
  • header of the markdown file,

  • content sections of the markdown file

Return type:

tuple[dict, list[dict[str, str]]]

slice_md_content(md_content)

Convert markdown content to HTML sections.

Parameters:

md_content (str) – Markdown content

Returns:

List of sections

Return type:

list[dict[str, str]]

Example

>>> from mdslicer import MDSlicer
>>> slicer = MDSlicer()
>>> md_content = '''
... # Title
...
... Some content
...
... ## Section 1
...
... Content 1
...
... ## Section 2
...
... Content 2'''
>>> slicer.slice_md_content(md_content)
[{'title': '', 'id': '', 'content': '<h1>Title</h1>\n<p>Some content</p>\n'},
 {'title': 'Section 1', 'id': 'section-1', 'content': '\n<p>Content 1</p>\n'},
 {'title': 'Section 2', 'id': 'section-2', 'content': '\n<p>Content 2</p>'}]

split_header_and_content(file_content)

Split the content of a markdown file into a YAML header and the remaining content

Parameters:

file_content (str) – content of the markdown file

Returns:
  • header of the markdown file,

  • content of the markdown file

Return type:

tuple[dict, str]
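
The kind of split described here can be illustrated with a minimal standard-library sketch. It is not the library's implementation: split_front_matter is a hypothetical name, and instead of a real YAML parser it only understands flat "key: value" lines between two "---" delimiters, which is an assumption made to keep the example dependency-free.

```python
def split_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split text into a flat header dictionary and the remaining content."""
    lines = text.lstrip().splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return {}, text  # no header block at the top of the text
    for end, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            break  # 'end' is the index of the closing delimiter
    else:
        return {}, text  # opening delimiter without a closing one
    header = {}
    for line in lines[1:end]:
        if ":" in line:
            key, _, value = line.partition(":")
            header[key.strip()] = value.strip()
    return header, "".join(lines[end + 1:])


header, content = split_front_matter("---\ntitle: Example\n---\n\n## Section 1\n")
print(header)
print(repr(content))
```

Nested mappings, lists and typed values all need a real YAML parser; the sketch only shows where the header ends and the content begins.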