API Reference

pysubs2 — the main module

class pysubs2.Color(r: int, g: int, b: int, a: int = 0)

8-bit RGB color with alpha channel.

All values are ints from 0 to 255.

pysubs2.load(path: str, encoding: str = 'utf-8', format_: str | None = None, fps: float | None = None, errors: str | None = 'surrogateescape', **kwargs: Any) SSAFile

Alias for SSAFile.load().

pysubs2.load_from_whisper(result_or_segments: Dict[str, Any] | List[Dict[str, Any]]) SSAFile

Alias for pysubs2.whisper.load_from_whisper().

pysubs2.make_time(h: int | float = 0, m: int | float = 0, s: int | float = 0, ms: int | float = 0, frames: int | None = None, fps: float | None = None) int

Alias for pysubs2.time.make_time().

enum pysubs2.Alignment(value)

An integer enum specifying text alignment

The integer values correspond to the Advanced SubStation Alpha definition (laid out like a numeric keypad). Note that the older SubStation Alpha (SSA) specification used a different numbering scheme.

Member Type:

int

Valid values are as follows:

BOTTOM_LEFT = <Alignment.BOTTOM_LEFT: 1>
BOTTOM_CENTER = <Alignment.BOTTOM_CENTER: 2>
BOTTOM_RIGHT = <Alignment.BOTTOM_RIGHT: 3>
MIDDLE_LEFT = <Alignment.MIDDLE_LEFT: 4>
MIDDLE_CENTER = <Alignment.MIDDLE_CENTER: 5>
MIDDLE_RIGHT = <Alignment.MIDDLE_RIGHT: 6>
TOP_LEFT = <Alignment.TOP_LEFT: 7>
TOP_CENTER = <Alignment.TOP_CENTER: 8>
TOP_RIGHT = <Alignment.TOP_RIGHT: 9>

SSAFile — a subtitle file

class pysubs2.SSAFile

Subtitle file in SubStation Alpha format.

This class has a list-like interface which exposes SSAFile.events, list of subtitles in the file:

subs = SSAFile.load("subtitles.srt")

for line in subs:
    print(line.text)

subs.insert(0, SSAEvent(start=0, end=make_time(s=2.5), text="New first subtitle"))

del subs[0]

aegisub_project: Dict[str, str]

Dict with Aegisub project, ie. [Aegisub Project Garbage].

events: List[SSAEvent]

List of SSAEvent instances, ie. individual subtitles.

fonts_opaque: Dict[str, Any]

Dict with embedded fonts, ie. [Fonts].

format: str | None

Format of source subtitle file, if applicable, eg. "srt".

fps: float | None

Framerate used when reading the file, if applicable.

info: Dict[str, str]

Dict with script metadata, ie. [Script Info].

styles: Dict[str, SSAStyle]

Dict of SSAStyle instances.

Reading and writing subtitles

Using path to file

classmethod SSAFile.load(path: str, encoding: str = 'utf-8', format_: str | None = None, fps: float | None = None, errors: str | None = 'surrogateescape', **kwargs: Any) SSAFile

Load subtitle file from given path.

This method is implemented in terms of SSAFile.from_file().

See also

Specific formats may implement additional loading options; please refer to the documentation of the implementation classes (eg. pysubs2.formats.subrip.SubripFormat.from_file()).

Parameters:
  • path (str) – Path to subtitle file.

  • encoding (str) – Character encoding of input file. Defaults to UTF-8; change this if the file uses another encoding.

  • errors (Optional[str]) –

    Error handling for character encoding of input file. Defaults to "surrogateescape". See documentation of builtin open() function for more.

    Changed in version 2.0.0: The errors parameter was introduced to facilitate pass-through of subtitle files with unknown text encoding. Previous versions of the library behaved as if errors=None.

  • format_ (str) – Optional, forces use of a specific parser (eg. “srt”, “ass”). Otherwise, the format is detected automatically from file contents. This argument should rarely be needed.

  • fps (float) – Framerate for frame-based formats (MicroDVD), for other formats this argument is ignored. Framerate might be detected from the file, in which case you don’t need to specify it here (when given, this argument overrides autodetection).

  • kwargs – Extra options for the reader.

Returns:

SSAFile

Raises:

Note

pysubs2 may autodetect subtitle format and/or framerate. These values are set as SSAFile.format and SSAFile.fps attributes.

Example

>>> subs1 = pysubs2.load("subrip-subtitles.srt")
>>> subs2 = pysubs2.load("microdvd-subtitles.sub", fps=23.976)
>>> subs3 = pysubs2.load("subrip-subtitles-with-fancy-tags.srt", keep_unknown_html_tags=True)

SSAFile.save(path: str, encoding: str = 'utf-8', format_: str | None = None, fps: float | None = None, errors: str | None = 'surrogateescape', **kwargs: Any) None

Save subtitle file to given path.

This method is implemented in terms of SSAFile.to_file().

See also

Specific formats may implement additional saving options; please refer to the documentation of the implementation classes (eg. pysubs2.formats.subrip.SubripFormat.to_file()).

Parameters:
  • path (str) – Path to subtitle file.

  • encoding (str) – Character encoding of output file. Defaults to UTF-8, which should be fine for most purposes.

  • format_ (str) – Optional, specifies the desired subtitle format (eg. “srt”, “ass”). Otherwise, the format is detected automatically from the file extension, so this argument is rarely needed.

  • fps (float) – Framerate for frame-based formats (MicroDVD), for other formats this argument is ignored. When omitted, SSAFile.fps value is used (ie. the framerate used for loading the file, if any). When the SSAFile wasn’t loaded from MicroDVD, or if you wish to save it with a different framerate, use this argument. See also SSAFile.transform_framerate() for fixing bad frame-based to time-based conversions.

  • errors (Optional[str]) –

    Error handling for character encoding, defaults to "surrogateescape". See documentation of builtin open() function for more.

    Changed in version 2.0.0: The errors parameter was introduced to facilitate pass-through of subtitle files with unknown text encoding. Previous versions of the library behaved as if errors=None.

  • kwargs – Extra options for the writer.

Raises:

Using string

classmethod SSAFile.from_string(string: str, format_: str | None = None, fps: float | None = None, **kwargs: Any) SSAFile

Load subtitle file from string.

See SSAFile.load() for full description.

Parameters:
  • string (str) – Subtitle file in a string. Note that the string must be Unicode (str, not bytes).

  • format_ (str) – Optional, forces use of a specific parser (eg. “srt”, “ass”). Otherwise, the format is detected automatically from file contents. This argument should rarely be needed.

  • fps (float) – Framerate for frame-based formats (MicroDVD), for other formats this argument is ignored. Framerate might be detected from the file, in which case you don’t need to specify it here (when given, this argument overrides autodetection).

Returns:

SSAFile

Example

>>> text = '''
... 1
... 00:00:00,000 --> 00:00:05,000
... An example SubRip file.
... '''
>>> subs = SSAFile.from_string(text)

SSAFile.to_string(format_: str, fps: float | None = None, **kwargs: Any) str

Get subtitle file as a string.

See SSAFile.save() for full description.

Returns:

str

Using file object

classmethod SSAFile.from_file(fp: TextIO, format_: str | None = None, fps: float | None = None, **kwargs: Any) SSAFile

Read subtitle file from file object.

See SSAFile.load() for full description.

Note

This is a low-level method. Usually, one of SSAFile.load() or SSAFile.from_string() is preferable.

Parameters:
  • fp (file object) – A file object, ie. TextIO instance. Note that the file must be opened in text mode (as opposed to binary).

  • format_ (str) – Optional, forces use of a specific parser (eg. “srt”, “ass”). Otherwise, the format is detected automatically from file contents. This argument should rarely be needed.

  • fps (float) – Framerate for frame-based formats (MicroDVD), for other formats this argument is ignored. Framerate might be detected from the file, in which case you don’t need to specify it here (when given, this argument overrides autodetection).

Returns:

SSAFile

SSAFile.to_file(fp: TextIO, format_: str, fps: float | None = None, **kwargs: Any) None

Write subtitle file to file object.

See SSAFile.save() for full description.

Note

This is a low-level method. Usually, one of SSAFile.save() or SSAFile.to_string() is preferable.

Parameters:

fp (file object) – A file object, ie. TextIO instance. Note that the file must be opened in text mode (as opposed to binary).

Retiming subtitles

SSAFile.shift(h: int | float = 0, m: int | float = 0, s: int | float = 0, ms: int | float = 0, frames: int | None = None, fps: float | None = None) None

Shift all subtitles by constant time amount.

Shift may be time-based (the default) or frame-based. In the latter case, specify both frames and fps. h, m, s, ms will be ignored.

Parameters:
  • h – Hours; int or float, may be positive or negative.

  • m – Minutes; int or float, may be positive or negative.

  • s – Seconds; int or float, may be positive or negative.

  • ms – Milliseconds; int or float, may be positive or negative.

  • frames (int) – When specified, must be an integer number of frames. May be positive or negative. fps must be also specified.

  • fps (float) – When specified, must be a positive number.

Raises:

ValueError – Invalid fps or missing number of frames.

SSAFile.transform_framerate(in_fps: float, out_fps: float) None

Rescale all timestamps by ratio of in_fps/out_fps.

Can be used to fix files converted from frame-based to time-based with wrongly assumed framerate.

Parameters:
  • in_fps (float)

  • out_fps (float)

Raises:

ValueError – Non-positive framerate given.

Working with styles

SSAFile.rename_style(old_name: str, new_name: str) None

Rename a style, including references to it.

Parameters:
  • old_name (str) – Style to be renamed.

  • new_name (str) – New name for the style (must be unused).

Raises:
  • KeyError – No style named old_name.

  • ValueError – new_name is not a legal name (cannot use commas) or new_name is taken.

SSAFile.import_styles(subs: SSAFile, overwrite: bool = True) None

Merge in styles from other SSAFile.

Parameters:
  • subs (SSAFile) – Subtitle file imported from.

  • overwrite (bool) – On name conflict, use style from the other file (default: True).

Misc methods

SSAFile.remove_miscellaneous_events() None

Remove subtitles which appear to be non-essential (the --clean option in the CLI).

Currently, this removes events matching any of these criteria:

  • SSA event type Comment

  • SSA drawing tags

  • Less than two characters of text

  • Duplicated text with identical time interval (only the first event is kept)

SSAFile.equals(other: SSAFile) bool

Equality of two SSAFiles.

Compares SSAFile.info, SSAFile.styles and SSAFile.events. Order of entries in OrderedDicts does not matter. “ScriptType” key in info is considered an implementation detail and thus ignored.

Useful mostly in unit tests. Differences are logged at DEBUG level.

SSAFile.sort() None

Sort subtitles time-wise, in-place.

SSAEvent — one subtitle

class pysubs2.SSAEvent(start: int = 0, end: int = 10000, text: str = '', marked: bool = False, layer: int = 0, style: str = 'Default', name: str = '', marginl: int = 0, marginr: int = 0, marginv: int = 0, effect: str = '', type: str = 'Dialogue')

A SubStation Event, ie. one subtitle.

In SubStation, each subtitle consists of multiple “fields” like Start, End and Text. These are exposed as attributes (note that they are lowercase; see SSAEvent.FIELDS for a list). Additionally, there are some convenience properties like SSAEvent.plaintext or SSAEvent.duration.

This class defines an ordering with respect to (start, end) timestamps.

Tip

Use pysubs2.make_time() to get times in milliseconds.

Example:

>>> ev = SSAEvent(start=make_time(s=1), end=make_time(s=2.5), text="Hello World!")

property FIELDS: FrozenSet[str]

All fields in SSAEvent.

copy() SSAEvent

Return a copy of the SSAEvent.

property duration: int | float

Subtitle duration in milliseconds (read/write property).

Writing to this property adjusts SSAEvent.end. Setting negative durations raises ValueError.

effect: str = ''

Line effect

end: int = 10000

Subtitle end time (in milliseconds)

equals(other: SSAEvent) bool

Field-based equality for SSAEvents.

property is_comment: bool

When true, the subtitle is a comment, ie. not visible (read/write property).

Setting this property is equivalent to changing SSAEvent.type to "Dialogue" or "Comment".

property is_drawing: bool

Returns True if line is SSA drawing tag (ie. not text)

property is_text: bool

Returns False for SSA drawings and comment lines, True otherwise

In general, non-SSA formats should ignore events for which this is False.

layer: int = 0

Layer number, 0 is the lowest layer (ASS only)

marginl: int = 0

Left margin

marginr: int = 0

Right margin

marginv: int = 0

Vertical margin

marked: bool = False

(SSA only)

name: str = ''

Actor name

property plaintext: str

Subtitle text as multi-line string with no tags (read/write property).

Writing to this property replaces SSAEvent.text with given plain text. Newlines are converted to \N tags.

shift(h: int | float = 0, m: int | float = 0, s: int | float = 0, ms: int | float = 0, frames: int | None = None, fps: float | None = None) None

Shift start and end times.

See SSAFile.shift() for full description.

start: int = 0

Subtitle start time (in milliseconds)

style: str = 'Default'

Style name

text: str = ''

Text of subtitle (with SubStation override tags)

type: str = 'Dialogue'

Line type (Dialogue/Comment)

SSAStyle — a subtitle style

class pysubs2.SSAStyle(fontname: str = 'Arial', fontsize: float = 20.0, primarycolor: ~pysubs2.common.Color = <factory>, secondarycolor: ~pysubs2.common.Color = <factory>, tertiarycolor: ~pysubs2.common.Color = <factory>, outlinecolor: ~pysubs2.common.Color = <factory>, backcolor: ~pysubs2.common.Color = <factory>, bold: bool = False, italic: bool = False, underline: bool = False, strikeout: bool = False, scalex: float = 100.0, scaley: float = 100.0, spacing: float = 0.0, angle: float = 0.0, borderstyle: int = 1, outline: float = 2.0, shadow: float = 2.0, alignment: ~pysubs2.common.Alignment = Alignment.BOTTOM_CENTER, marginl: int = 10, marginr: int = 10, marginv: int = 10, alphalevel: int = 0, encoding: int = 1, drawing: bool = False)

A SubStation Style.

In SubStation, each subtitle (SSAEvent) is associated with a style which defines its font, color, etc. Like a subtitle event, a style also consists of “fields”; see SSAStyle.FIELDS for a list (note the spelling, which is different from SubStation proper).

Subtitles and styles are connected via an SSAFile they belong to. SSAEvent.style is a string which is (or should be) a key in the SSAFile.styles dict. Note that style name is stored separately; a given SSAStyle instance has no particular name itself.

This class defines equality (equality of all fields).

property FIELDS: FrozenSet[str]

All fields in SSAStyle.

alignment: Alignment = 2

Text alignment (pysubs2.Alignment instance); the underlying integer uses numpad-style alignment, eg. 7 is “top left” (that is, ASS alignment semantics). You can also use int here, though it is discouraged.

alphalevel: int = 0

Old, unused SSA-only field

angle: float = 0.0

Rotation (ASS only)

backcolor: Color

Back, ie. shadow color (pysubs2.Color instance)

bold: bool = False

Bold

borderstyle: int = 1

Border style (1=outline, 3=box)

drawing: bool = False

Indicates that a text span is an SSA vector drawing; see pysubs2.formats.substation.parse_tags()

encoding: int = 1

Charset

fontname: str = 'Arial'

Font name

fontsize: float = 20.0

Font size (in pixels)

italic: bool = False

Italic

marginl: int = 10

Left margin (in pixels)

marginr: int = 10

Right margin (in pixels)

marginv: int = 10

Vertical margin (in pixels)

outline: float = 2.0

Outline width (in pixels)

outlinecolor: Color

Outline color (pysubs2.Color instance)

primarycolor: Color

Primary color (pysubs2.Color instance)

scalex: float = 100.0

Horizontal scaling (ASS only)

scaley: float = 100.0

Vertical scaling (ASS only)

secondarycolor: Color

Secondary color (pysubs2.Color instance)

shadow: float = 2.0

Shadow depth (in pixels)

spacing: float = 0.0

Letter spacing (ASS only)

strikeout: bool = False

Strikeout (ASS only)

tertiarycolor: Color

Tertiary color (pysubs2.Color instance)

underline: bool = False

Underline (ASS only)

pysubs2.time — time-related utilities

pysubs2.time.TIMESTAMP = re.compile('(\\d{1,2}):(\\d{1,2}):(\\d{1,2})[.,](\\d{1,3})')

Pattern that matches both SubStation and SubRip timestamps.

pysubs2.time.TIMESTAMP_SHORT = re.compile('(\\d{1,2}):(\\d{2}):(\\d{2})')

Pattern that matches H:MM:SS or HH:MM:SS timestamps.

pysubs2.time.frames_to_ms(frames: int, fps: float) int

Convert frame-based duration to milliseconds.

Parameters:
  • frames – Number of frames (should be int).

  • fps – Framerate (must be a positive number, eg. 23.976).

Returns:

Number of milliseconds (rounded to int).

Raises:

ValueError – fps was negative or zero.

pysubs2.time.make_time(h: int | float = 0, m: int | float = 0, s: int | float = 0, ms: int | float = 0, frames: int | None = None, fps: float | None = None) int

Convert time to milliseconds.

See pysubs2.time.times_to_ms(). When both frames and fps are specified, pysubs2.time.frames_to_ms() is called instead.

Raises:

ValueError – Invalid fps, or one of frames/fps is missing.

Example

>>> make_time(s=1.5)
1500
>>> make_time(frames=50, fps=25)
2000

pysubs2.time.ms_to_frames(ms: int | float, fps: float) int

Convert milliseconds to number of frames.

Parameters:
  • ms – Number of milliseconds (may be int, float or other numeric class).

  • fps – Framerate (must be a positive number, eg. 23.976).

Returns:

Number of frames (int).

Raises:

ValueError – fps was negative or zero.

pysubs2.time.ms_to_str(ms: int | float, fractions: bool = False) str

Prettyprint milliseconds to [-]H:MM:SS[.mmm]

Handles huge and/or negative times. Non-negative times with fractions=True are matched by pysubs2.time.TIMESTAMP.

Parameters:
  • ms – Number of milliseconds (int, float or other numeric class).

  • fractions – Whether to print up to millisecond precision.

Returns:

str

pysubs2.time.ms_to_times(ms: int | float) Times

Convert milliseconds to normalized tuple (h, m, s, ms).

Parameters:

ms – Number of milliseconds (may be int, float or other numeric class). Should be non-negative.

Returns:

Named tuple (h, m, s, ms) of ints. Invariants: ms in range(1000) and s in range(60) and m in range(60)

pysubs2.time.times_to_ms(h: int | float = 0, m: int | float = 0, s: int | float = 0, ms: int | float = 0) int

Convert hours, minutes, seconds to milliseconds.

Arguments may be positive or negative, int or float, need not be normalized (s=120 is okay).

Returns:

Number of milliseconds (rounded to int).

pysubs2.time.timestamp_to_ms(groups: Sequence[str]) int

Convert groups from pysubs2.time.TIMESTAMP or pysubs2.time.TIMESTAMP_SHORT match to milliseconds.

Example

>>> timestamp_to_ms(TIMESTAMP.match("0:00:00.42").groups())
420
>>> timestamp_to_ms(TIMESTAMP_SHORT.match("0:00:01").groups())
1000

pysubs2.exceptions — thrown exceptions

exception pysubs2.exceptions.FormatAutodetectionError(content: str, formats: List[str])

Bases: Pysubs2Error

Subtitle format is ambiguous or unknown based on analysis of file fragment

This exception is raised by SSAFile.load() and related methods when the format_ parameter is not specified. In that case, the library tries to guess the input format from the first few kilobytes of the input file and raises this exception if the format cannot be uniquely determined.

content

Analyzed subtitle file content

Type:

str

formats

Format identifiers for detected formats

Type:

list[str]

exception pysubs2.exceptions.Pysubs2Error

Bases: Exception

Base class for pysubs2 exceptions.

exception pysubs2.exceptions.UnknownFPSError

Bases: Pysubs2Error

Framerate was not specified and couldn’t be inferred otherwise.

exception pysubs2.exceptions.UnknownFileExtensionError(ext: str)

Bases: Pysubs2Error

File extension does not pertain to any known subtitle format.

This exception is raised by SSAFile.save() when the format_ parameter is not specified. In that case, the library tries to guess the desired format from the output filename and raises this exception when it fails.

ext

File extension

Type:

str

exception pysubs2.exceptions.UnknownFormatIdentifierError(format_: str)

Bases: Pysubs2Error

Unknown subtitle format identifier (ie. string like "srt").

This exception is used when interpreting format_ parameter fails, eg. in SSAFile.save().

format_

Format identifier

Type:

str

pysubs2.formats — subtitle format implementations

Note

This subpackage contains pysubs2 internals. It’s mostly of interest if you’re looking to implement a subtitle format not supported by the library. In that case, have a look at pysubs2.formats.FormatBase.

pysubs2.formats.substation.parse_tags(text: str, style: ~pysubs2.ssastyle.SSAStyle = <SSAStyle 20.0px 'Arial'>, styles: ~typing.Dict[str, ~pysubs2.ssastyle.SSAStyle] | None = None) List[Tuple[str, SSAStyle]]

Split text into fragments with computed SSAStyles.

Returns list of tuples (fragment, style), where fragment is a part of text between two brace-delimited override sequences, and style is the computed styling of the fragment, ie. the original style modified by all override sequences before the fragment.

Newline and non-breakable space overrides are left as-is.

Supported override tags:

  • i, b, u, s

  • r (with or without style name)

pysubs2.formats.FILE_EXTENSION_TO_FORMAT_IDENTIFIER: Dict[str, str] = {'.ass': 'ass', '.json': 'json', '.srt': 'srt', '.ssa': 'ssa', '.sub': 'microdvd', '.txt': 'tmp', '.vtt': 'vtt'}

Dict mapping file extensions to format identifiers.

pysubs2.formats.FORMAT_IDENTIFIER_TO_FORMAT_CLASS: Dict[str, Type[FormatBase]] = {'ass': <class 'pysubs2.formats.substation.SubstationFormat'>, 'json': <class 'pysubs2.formats.jsonformat.JSONFormat'>, 'microdvd': <class 'pysubs2.formats.microdvd.MicroDVDFormat'>, 'mpl2': <class 'pysubs2.formats.mpl2.MPL2Format'>, 'srt': <class 'pysubs2.formats.subrip.SubripFormat'>, 'ssa': <class 'pysubs2.formats.substation.SubstationFormat'>, 'tmp': <class 'pysubs2.formats.tmp.TmpFormat'>, 'vtt': <class 'pysubs2.formats.webvtt.WebVTTFormat'>}

Dict mapping format identifiers to implementations (FormatBase subclasses).

pysubs2.formats.autodetect_format(content: str) str

Return format identifier for given fragment or raise FormatAutodetectionError.

pysubs2.formats.get_file_extension(format_: str) str

Format identifier -> file extension

pysubs2.formats.get_format_class(format_: str) Type[FormatBase]

Format identifier -> format class (ie. subclass of FormatBase)

pysubs2.formats.get_format_identifier(ext: str) str

File extension -> format identifier

Subtitle format API

class pysubs2.formats.FormatBase

Base class for subtitle format implementations.

How to implement a new subtitle format:

  1. Create a subclass of FormatBase and override the methods you want to support.

  2. Decide on a format identifier, like the "srt" or "microdvd" already used in the library.

  3. Add your identifier and class to pysubs2.formats.FORMAT_IDENTIFIER_TO_FORMAT_CLASS.

  4. (optional) Add your file extension and format identifier to pysubs2.formats.FILE_EXTENSION_TO_FORMAT_IDENTIFIER.

After finishing these steps, you can call SSAFile.load() and SSAFile.save() with your format, including autodetection from content and file extension (if you provided these).

classmethod from_file(subs: SSAFile, fp: TextIO, format_: str, **kwargs: Any) None

Load subtitle file into an empty SSAFile.

If the parser autodetects framerate, set it as subs.fps.

Parameters:
  • subs (SSAFile) – An empty SSAFile.

  • fp (file object) – Text file object, the subtitle file.

  • format_ (str) – Format identifier. Used when one format class implements multiple formats (see SubstationFormat).

  • kwargs – Extra options, eg. fps.

Returns:

None

Raises:

pysubs2.exceptions.UnknownFPSError – Framerate was not provided and cannot be detected.

classmethod guess_format(text: str) str | None

Return format identifier of recognized format, or None.

Parameters:

text (str) – Content of subtitle file. When the file is long, this may be only its first few thousand characters.

Returns:

format identifier (eg. "srt") or None (unknown format)

classmethod to_file(subs: SSAFile, fp: TextIO, format_: str, **kwargs: Any) None

Write SSAFile into a file.

If you need framerate and it is not passed in keyword arguments, use subs.fps.

Parameters:
  • subs (SSAFile) – Subtitle file to write.

  • fp (file object) – Text file object used as output.

  • format_ (str) – Format identifier of desired output format. Used when one format class implements multiple formats (see SubstationFormat).

  • kwargs – Extra options, eg. fps.

Returns:

None

Raises:

pysubs2.exceptions.UnknownFPSError – Framerate was not provided and subs.fps is None.

Subtitle format implementations

Here you can find specific details regarding support of the individual subtitle formats.

Tip

Some formats support additional keyword parameters in their from_file() or to_file() methods. These are used to customize the parser/writer behaviour.

class pysubs2.formats.substation.SubstationFormat

Bases: FormatBase

SubStation Alpha (ASS, SSA) subtitle format implementation

classmethod from_file(subs: SSAFile, fp: TextIO, format_: str, **kwargs: Any) None

See pysubs2.formats.FormatBase.from_file()

classmethod guess_format(text: str) str | None

See pysubs2.formats.FormatBase.guess_format()

static ms_to_timestamp(requested_ms: int) str

Convert ms to ‘H:MM:SS.cc’

classmethod to_file(subs: SSAFile, fp: TextIO, format_: str, header_notice: str = 'Script generated by pysubs2\nhttps://pypi.python.org/pypi/pysubs2', **kwargs: Any) None

See pysubs2.formats.FormatBase.to_file()

class pysubs2.formats.subrip.SubripFormat

Bases: FormatBase

SubRip Text (SRT) subtitle format implementation

classmethod from_file(subs: SSAFile, fp: TextIO, format_: str, keep_html_tags: bool = False, keep_unknown_html_tags: bool = False, **kwargs: Any) None

See pysubs2.formats.FormatBase.from_file()

Supported tags:

  • <i>

  • <u>

  • <s>

  • <b>

Keyword Arguments:
  • keep_html_tags – If True, all HTML tags will be kept as-is instead of being converted to SubStation tags (eg. you will get <i>example</i> instead of {\i1}example{\i0}). Setting this to True overrides the keep_unknown_html_tags option.

  • keep_unknown_html_tags – If True, supported HTML tags will be converted to SubStation tags and any other HTML tags will be kept as-is (eg. you would get <blink>example {\i1}text{\i0}</blink>). If False, these other HTML tags will be stripped from output (in the previous example, you would get only example {\i1}text{\i0}).

classmethod guess_format(text: str) str | None

See pysubs2.formats.FormatBase.guess_format()

static ms_to_timestamp(ms: int) str

Convert ms to ‘HH:MM:SS,mmm’

classmethod to_file(subs: SSAFile, fp: TextIO, format_: str, apply_styles: bool = True, keep_ssa_tags: bool = False, **kwargs: Any) None

See pysubs2.formats.FormatBase.to_file()

Italic, underline and strikeout styling is supported.

Keyword Arguments:
  • apply_styles – If False, do not write any styling (ignore line style and override tags).

  • keep_ssa_tags – If True, inline override tags are passed to the output as-is instead of being converted to HTML (as supported by SRT); eg. you will get {\an7}, which would otherwise be stripped, or {\b1} instead of <b>. Whitespace tags \h, \n and \N are always converted to whitespace regardless of this option. In the current implementation, enabling this option disables processing of line styles: you will get inline tags, but if for example the line’s style is italic, you will not get {\i1} at the beginning of the line. (Since this option is mostly useful for dealing with non-standard SRT files, ie. both input and output are SRT, which doesn’t use line styles, this shouldn’t be much of an issue in practice.)

class pysubs2.formats.mpl2.MPL2Format

Bases: FormatBase

MPL2 subtitle format implementation

classmethod from_file(subs: SSAFile, fp: TextIO, format_: str, **kwargs: Any) None

See pysubs2.formats.FormatBase.from_file()

classmethod guess_format(text: str) str | None

See pysubs2.formats.FormatBase.guess_format()

classmethod to_file(subs: SSAFile, fp: TextIO, format_: str, **kwargs: Any) None

See pysubs2.formats.FormatBase.to_file()

No styling is supported at the moment.

class pysubs2.formats.tmp.TmpFormat

Bases: FormatBase

TMP subtitle format implementation

classmethod from_file(subs: SSAFile, fp: TextIO, format_: str, **kwargs: Any) None

See pysubs2.formats.FormatBase.from_file()

classmethod guess_format(text: str) str | None

See pysubs2.formats.FormatBase.guess_format()

static ms_to_timestamp(ms: int) str

Convert ms to ‘HH:MM:SS’

classmethod to_file(subs: SSAFile, fp: TextIO, format_: str, apply_styles: bool = True, **kwargs: Any) None

See pysubs2.formats.FormatBase.to_file()

Italic, underline and strikeout styling is supported.

Keyword Arguments:

apply_styles – If False, do not write any styling.

class pysubs2.formats.webvtt.WebVTTFormat

Bases: SubripFormat

Web Video Text Tracks (WebVTT) subtitle format implementation

Currently, this shares implementation with pysubs2.formats.subrip.SubripFormat.

classmethod guess_format(text: str) str | None

See pysubs2.formats.FormatBase.guess_format()

static ms_to_timestamp(ms: int) str

Convert ms to ‘HH:MM:SS.mmm’

classmethod to_file(subs: SSAFile, fp: TextIO, format_: str, **kwargs: Any) None

See pysubs2.formats.subrip.SubripFormat.to_file(); the additional SRT options are supported by VTT as well.

class pysubs2.formats.microdvd.MicroDVDFormat

Bases: FormatBase

MicroDVD subtitle format implementation

classmethod from_file(subs: SSAFile, fp: TextIO, format_: str, fps: float | None = None, strict_fps_inference: bool = True, **kwargs: Any) None

See pysubs2.formats.FormatBase.from_file()

Keyword Arguments:

strict_fps_inference – If True (default) and fps is not given, the framerate is read from the text of the first subtitle only if that subtitle’s start and end frames are {1}{1} (this matches VLC Player behaviour); otherwise UnknownFPSError is raised. If False, the framerate is read from the text of the first subtitle regardless of its start and end frames, which may give a bogus result if the first subtitle is not supposed to contain the framerate. Before this option was introduced, the library behaved as if it were False.

classmethod guess_format(text: str) str | None

See pysubs2.formats.FormatBase.guess_format()

classmethod to_file(subs: SSAFile, fp: TextIO, format_: str, fps: float | None = None, write_fps_declaration: bool = True, apply_styles: bool = True, **kwargs: Any) None

See pysubs2.formats.FormatBase.to_file()

The only supported styling is marking whole lines italic.

Keyword Arguments:
  • write_fps_declaration – If True, create a zero-duration first subtitle {1}{1} which will contain the fps.

  • apply_styles – If False, do not write any styling.

class pysubs2.formats.jsonformat.JSONFormat

Bases: FormatBase

Implementation of JSON subtitle pseudo-format (serialized pysubs2 internal representation)

This is essentially SubStation Alpha as JSON.

classmethod from_file(subs: SSAFile, fp: TextIO, format_: str, **kwargs: Any) None

See pysubs2.formats.FormatBase.from_file()

classmethod guess_format(text: str) str | None

See pysubs2.formats.FormatBase.guess_format()

classmethod to_file(subs: SSAFile, fp: TextIO, format_: str, **kwargs: Any) None

See pysubs2.formats.FormatBase.to_file()

Misc functions

pysubs2.whisper.load_from_whisper(result_or_segments: Dict[str, Any] | List[Dict[str, Any]]) SSAFile

Load subtitle file from OpenAI Whisper transcript

Example

>>> import whisper
>>> import pysubs2
>>> model = whisper.load_model("base")
>>> result = model.transcribe("audio.mp3")
>>> subs = pysubs2.load_from_whisper(result)
>>> subs.save("audio.ass")

Parameters:

result_or_segments – Either a dict with a "segments" key that holds a list of segment dicts, or the segment list-of-dicts. Each segment is a dict with keys "start", "end" (float, timestamps in seconds) and "text" (str with caption text).

Returns:

pysubs2.SSAFile