mdslicer.mdslicer

Parse a markdown file into a header and sections. The header is a dictionary holding the metadata of the markdown file. The sections are a list of dictionaries, each with the title, id and HTML content of one section. For example:

sections =
[{'title': 'Section 1', 'id': 'section-1', 'content': '\n<p>Content 1</p>\n'},
 {'title': 'Section 2', 'id': 'section-2', 'content': '\n<p>Content 2</p>'}]

Functions

split_header_and_content(file_content)

Split the content of a markdown file into a YAML header and the remaining markdown content

Classes

MDSlicer([additional_parser])

Parse markdown content into a metadata header and sections

class MDSlicer(additional_parser=None, **kwargs)

Parse markdown content into a metadata header and sections

__init__(additional_parser=None, **kwargs)

Create a markdown parser with the given extensions.

Parameters:

additional_parser (Optional[Callable]) – Additional parser to apply on the markdown content
kwargs – Keyword arguments to pass to the markdown.Markdown() parser initializer (such as the list of extensions)
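
To illustrate the additional_parser parameter, here is a minimal sketch of a preprocessing callable. It assumes the callable receives the markdown text as a string and returns the transformed string; the description above only says it is applied to the markdown content, so that signature is an assumption. The function name, the VERSION constant and the {{version}} placeholder are hypothetical.

```python
# Hypothetical preprocessing step; the (str) -> str signature is an assumption.
VERSION = "1.2.0"  # illustrative value, not part of mdslicer


def expand_version_placeholder(md_content: str) -> str:
    """Replace every '{{version}}' placeholder in the markdown text."""
    return md_content.replace("{{version}}", VERSION)


print(expand_version_placeholder("## Release {{version}}"))  # prints: ## Release 1.2.0

# Passing it to the slicer would then look like this (requires mdslicer;
# `extensions` is forwarded to markdown.Markdown()):
# slicer = MDSlicer(additional_parser=expand_version_placeholder,
#                   extensions=["fenced_code"])
```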

get_sections(html)

Split the HTML content into sections at each h2 tag

Parameters: html (str) – HTML content
Returns: List of sections with an id, a title and an HTML content
Return type: list[dict[str, str]]

Example

>>> from mdslicer import MDSlicer
>>> slicer = MDSlicer()
>>> html = "<h2>Section 1</h2><p>Content 1</p><h2>Section 2</h2><p>Content 2</p>"
>>> slicer.get_sections(html)
[{'title': 'Section 1', 'id': 'section-1', 'content': '<p>Content 1</p>'},
 {'title': 'Section 2', 'id': 'section-2', 'content': '<p>Content 2</p>'}]
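
The splitting step can be sketched with the standard library alone. This is not the library's implementation: the names split_on_h2 and slugify, the regular expression, and the rule used to derive the id are all assumptions chosen to reproduce the example output above.

```python
import re

# Matches an <h2> element, optionally carrying attributes such as id="...".
H2_PATTERN = re.compile(r"<h2[^>]*>(.*?)</h2>", re.DOTALL)


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def split_on_h2(html: str) -> list[dict[str, str]]:
    """Split HTML into sections, one per <h2> heading (illustrative only)."""
    # With one capture group, re.split returns:
    # [text before first h2, title 1, content 1, title 2, content 2, ...]
    parts = H2_PATTERN.split(html)
    sections = []
    # Keep any non-empty text before the first <h2> as an untitled section.
    if parts[0].strip():
        sections.append({"title": "", "id": "", "content": parts[0]})
    for title, content in zip(parts[1::2], parts[2::2]):
        sections.append({"title": title, "id": slugify(title), "content": content})
    return sections


html = "<h2>Section 1</h2><p>Content 1</p><h2>Section 2</h2><p>Content 2</p>"
print(split_on_h2(html))
```

A regular expression is enough for flat HTML like this; nested or malformed markup would call for a real HTML parser.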

slice_content(file_content)

Parse a markdown string into a YAML header and a list of content sections

Parameters:

file_content (str) – content of the markdown file

Returns:

header of the markdown file,
content sections of the markdown file

Return type:

tuple[dict, list[dict[str, str]]]

Examples

>>> from mdslicer import MDSlicer
>>> slicer = MDSlicer()
>>> file_content = '''
... ---
... title: Example
... ---
...
... ## Section 1
...
... Content 1
...
... ## Section 2
...
... Content 2'''
>>> header, sections = slicer.slice_content(file_content)
>>> print(header)
{'title': 'Example'}
>>> sections
[{'title': 'Section 1', 'id': 'section-1', 'content': '\n<p>Content 1</p>\n'},
 {'title': 'Section 2', 'id': 'section-2', 'content': '\n<p>Content 2</p>'}]

slice_file(mdfile_path)

Parse a markdown file into a YAML header and a list of content sections

Parameters:

mdfile_path (str | Path) – Path to the markdown file

Returns:

header of the markdown file,
content sections of the markdown file

Return type:

tuple[dict, list[dict[str, str]]]

slice_md_content(md_content)

Convert markdown content to HTML sections.

Parameters: md_content (str) – Markdown content
Returns: List of sections
Return type: list[dict[str, str]]

Example

>>> from mdslicer import MDSlicer
>>> slicer = MDSlicer()
>>> md_content = '''
... # Title
...
... Some content
...
... ## Section 1
...
... Content 1
...
... ## Section 2
...
... Content 2'''
>>> slicer.slice_md_content(md_content)
[{'title': '', 'id': '', 'content': '<h1>Title</h1>\n<p>Some content</p>\n'},
 {'title': 'Section 1', 'id': 'section-1', 'content': '\n<p>Content 1</p>\n'},
 {'title': 'Section 2', 'id': 'section-2', 'content': '\n<p>Content 2</p>'}]
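
One way to consume a section list of this shape is to build a table of contents from the ids and titles. The sketch below is only an illustration: build_toc is a hypothetical helper, not part of mdslicer, and it assumes titles are plain text that is safe to HTML-escape.

```python
from html import escape


def build_toc(sections: list[dict[str, str]]) -> str:
    """Build an HTML list linking to every titled section."""
    items = [
        f'<li><a href="#{escape(s["id"])}">{escape(s["title"])}</a></li>'
        for s in sections
        if s["title"]  # skip the untitled section that precedes the first h2
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


# Section list copied from the example output above.
sections = [
    {"title": "", "id": "", "content": "<h1>Title</h1>\n<p>Some content</p>\n"},
    {"title": "Section 1", "id": "section-1", "content": "\n<p>Content 1</p>\n"},
    {"title": "Section 2", "id": "section-2", "content": "\n<p>Content 2</p>"},
]
print(build_toc(sections))
```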

split_header_and_content(file_content)

Split the content of a markdown file into a YAML header and the remaining markdown content

Parameters:

file_content (str) – content of the markdown file

Returns:

header of the markdown file,
content of the markdown file

Return type:

tuple[dict, str]
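
The idea of separating a '---'-delimited header from the body can be sketched without any third-party package. This is a simplified stand-in, not the library's code: split_front_matter is a hypothetical name, and instead of a real YAML parser it reads only flat "key: value" lines and keeps every value as a string.

```python
def split_front_matter(file_content: str) -> tuple[dict[str, str], str]:
    """Split a '---'-delimited header from the markdown body (illustrative only)."""
    lines = file_content.lstrip().splitlines()
    # No opening delimiter: treat the whole input as body.
    if not lines or lines[0].strip() != "---":
        return {}, file_content
    # Find the closing delimiter; without one there is no valid header.
    end = next(
        (i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"),
        None,
    )
    if end is None:
        return {}, file_content
    header = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that are not "key: value" pairs
            header[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:])
    return header, body


header, body = split_front_matter("---\ntitle: Example\n---\n\n## Section 1\n\nContent 1")
print(header)
print(body)
```

Nested mappings, lists and typed values need an actual YAML parser; the flat parsing here is only meant to show where the header ends and the body begins.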