API Reference

Note

The documentation is written from a Python 3 point of view; a “string” means a Unicode string.

Supported input/output formats

pysubs2 is built around SubStation Alpha, the native subtitle format of Aegisub.

SubStation Alpha — supported in two versions:

  • .ass files (Advanced SubStation Alpha v4.0+), format identifier is "ass".
  • .ssa files (SubStation Alpha v4.0), format identifier is "ssa".

SubRip — .srt files, format identifier is "srt".

MicroDVD — .sub files, format identifier is "microdvd".

MPL2 — Time-based format similar to MicroDVD, format identifier is "mpl2". To save subtitles in MPL2 format, use subs.save("subtitles.txt", format_="mpl2").

JSON-serialized internal representation, which amounts to ASS. Format identifier is "json".

pysubs2 — the main module

pysubs2.load(path, encoding='utf-8', format_=None, fps=None, **kwargs)

Alias for SSAFile.load().

pysubs2.make_time(h=0, m=0, s=0, ms=0, frames=None, fps=None)

Alias for pysubs2.time.make_time().

class pysubs2.Color

(r, g, b, a) namedtuple for 8-bit RGB color with alpha channel.

All values are ints from 0 to 255.

SSAFile — a subtitle file

class pysubs2.SSAFile

Subtitle file in SubStation Alpha format.

This class has a list-like interface which exposes SSAFile.events, the list of subtitles in the file:

subs = SSAFile.load("subtitles.srt")

for line in subs:
    print(line.text)

subs.insert(0, SSAEvent(start=0, end=make_time(s=2.5), text="New first subtitle"))

del subs[0]

aegisub_project = None

Dict with Aegisub project, ie. [Aegisub Project Garbage].

events = None

List of SSAEvent instances, ie. individual subtitles.

format = None

Format of source subtitle file, if applicable, eg. "srt".

fps = None

Framerate used when reading the file, if applicable.

info = None

Dict with script metadata, ie. [Script Info].

styles = None

Dict of SSAStyle instances.

Reading and writing subtitles

Using path to file

classmethod SSAFile.load(path, encoding='utf-8', format_=None, fps=None, **kwargs)

Load subtitle file from given path.

Parameters:
  • path (str) – Path to subtitle file.
  • encoding (str) – Character encoding of input file. Defaults to UTF-8; you may need to change this.
  • format_ (str) – Optional, forces use of a specific parser (eg. “srt”, “ass”). Otherwise, the format is detected automatically from file contents. This argument should rarely be needed.
  • fps (float) – Framerate for frame-based formats (MicroDVD); for other formats this argument is ignored. Framerate might be detected from the file, in which case you don’t need to specify it here (when given, this argument overrides autodetection).
  • kwargs – Extra options for the parser.
Returns:

SSAFile

Raises:

Note

pysubs2 may autodetect subtitle format and/or framerate. These values are set as SSAFile.format and SSAFile.fps attributes.

Example

>>> subs1 = pysubs2.load("subrip-subtitles.srt")
>>> subs2 = pysubs2.load("microdvd-subtitles.sub", fps=23.976)

SSAFile.save(path, encoding='utf-8', format_=None, fps=None, **kwargs)

Save subtitle file to given path.

Parameters:
  • path (str) – Path to subtitle file.
  • encoding (str) – Character encoding of output file. Defaults to UTF-8, which should be fine for most purposes.
  • format_ (str) – Optional, specifies desired subtitle format (eg. “srt”, “ass”). Otherwise, the format is detected automatically from the file extension, so this argument is rarely needed.
  • fps (float) – Framerate for frame-based formats (MicroDVD); for other formats this argument is ignored. When omitted, SSAFile.fps value is used (ie. the framerate used for loading the file, if any). When the SSAFile wasn’t loaded from MicroDVD, or if you wish to save it with a different framerate, use this argument. See also SSAFile.transform_framerate() for fixing bad frame-based to time-based conversions.
  • kwargs – Extra options for the writer.
Raises:

Using string

classmethod SSAFile.from_string(string, format_=None, fps=None, **kwargs)

Load subtitle file from string.

See SSAFile.load() for full description.

Parameters: string (str) – Subtitle file in a string. Note that the string must be Unicode (in Python 2).
Returns: SSAFile

Example

>>> text = '''
... 1
... 00:00:00,000 --> 00:00:05,000
... An example SubRip file.
... '''
>>> subs = SSAFile.from_string(text)

SSAFile.to_string(format_, fps=None, **kwargs)

Get subtitle file as a string.

See SSAFile.save() for full description.

Returns:str

Using file object

classmethod SSAFile.from_file(fp, format_=None, fps=None, **kwargs)

Read subtitle file from file object.

See SSAFile.load() for full description.

Note

This is a low-level method. Usually, one of SSAFile.load() or SSAFile.from_string() is preferable.

Parameters: fp (file object) – A file object, ie. io.TextIOBase instance. Note that the file must be opened in text mode (as opposed to binary).
Returns: SSAFile

SSAFile.to_file(fp, format_, fps=None, **kwargs)

Write subtitle file to file object.

See SSAFile.save() for full description.

Note

This is a low-level method. Usually, one of SSAFile.save() or SSAFile.to_string() is preferable.

Parameters: fp (file object) – A file object, ie. io.TextIOBase instance. Note that the file must be opened in text mode (as opposed to binary).

Retiming subtitles

SSAFile.shift(h=0, m=0, s=0, ms=0, frames=None, fps=None)

Shift all subtitles by constant time amount.

Shift may be time-based (the default) or frame-based. In the latter case, specify both frames and fps; the h, m, s and ms arguments are then ignored.

Parameters:
  • h, m, s, ms – Integer or float values, may be positive or negative.
  • frames (int) – When specified, must be an integer number of frames. May be positive or negative. fps must also be specified.
  • fps (float) – When specified, must be a positive number.
Raises:

ValueError – Invalid fps or missing number of frames.
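
As a rough illustration of the two modes, the sketch below shifts plain (start, end) millisecond pairs rather than real SSAEvent objects. The helpers shift_delta_ms and shift_pairs are hypothetical names, and rounding to the nearest millisecond is an assumption, not something this reference specifies.

```python
def shift_delta_ms(h=0, m=0, s=0, ms=0, frames=None, fps=None):
    """Return the shift in milliseconds (hypothetical helper, not pysubs2 code)."""
    if frames is None and fps is None:
        # Time-based shift: combine h/m/s/ms into milliseconds.
        return int(round(ms + 1000 * s + 60_000 * m + 3_600_000 * h))
    if frames is None or fps is None:
        raise ValueError("frames and fps must be given together")
    if fps <= 0:
        raise ValueError("fps must be positive")
    # Frame-based shift: h, m, s, ms are ignored.
    return int(round(frames * 1000 / fps))


def shift_pairs(pairs, **kwargs):
    """Shift every (start, end) pair by the same amount."""
    delta = shift_delta_ms(**kwargs)
    return [(start + delta, end + delta) for start, end in pairs]


pairs = [(0, 2000), (2500, 4000)]
later = shift_pairs(pairs, s=1.5)                 # time-based shift forward
earlier = shift_pairs(later, frames=-25, fps=25)  # frame-based shift backward
```

With the real library the equivalent calls would be subs.shift(s=1.5) and subs.shift(frames=-25, fps=25), which modify the file in place.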

SSAFile.transform_framerate(in_fps, out_fps)

Rescale all timestamps by ratio of in_fps/out_fps.

Can be used to fix files converted from frame-based to time-based with wrongly assumed framerate.

Parameters:
  • in_fps (float) –
  • out_fps (float) –
Raises:

ValueError – Non-positive framerate given.
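
The rescaling amounts to multiplying each timestamp by in_fps/out_fps. The sketch below works that arithmetic through on a bare millisecond value; rescale_ms is a hypothetical helper, and rounding to the nearest millisecond is an assumption.

```python
def rescale_ms(ms, in_fps, out_fps):
    """Rescale one timestamp by in_fps/out_fps (illustrative, not pysubs2 code)."""
    if in_fps <= 0 or out_fps <= 0:
        raise ValueError("Framerates must be positive")
    return int(round(ms * in_fps / out_fps))


# Frames were converted to times assuming 25 fps, but the video runs at 50 fps:
# every timestamp is twice too large, so the ratio 25/50 halves it.
halved = rescale_ms(1000, in_fps=25, out_fps=50)

# The opposite mistake doubles each timestamp.
doubled = rescale_ms(1000, in_fps=50, out_fps=25)
```

Here in_fps is the framerate that was wrongly assumed during conversion and out_fps is the true framerate of the video.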

Working with styles

SSAFile.rename_style(old_name, new_name)

Rename a style, including references to it.

Parameters:
  • old_name (str) – Style to be renamed.
  • new_name (str) – New name for the style (must be unused).
Raises:
  • KeyError – No style named old_name.
  • ValueError – new_name is not a legal name (cannot use commas) or new_name is taken.
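
To make "including references to it" concrete, here is a minimal sketch that uses plain dicts in place of SSAFile.styles and SSAEvent objects. The function is an illustration of the documented behaviour under those assumptions, not the library's implementation.

```python
def rename_style(styles, events, old_name, new_name):
    """Rename a style key and update events that refer to it (illustrative only)."""
    if old_name not in styles:
        raise KeyError(f"No style named {old_name!r}")
    if new_name in styles:
        raise ValueError(f"Style name {new_name!r} is already taken")
    if "," in new_name:
        raise ValueError("Style names cannot contain commas")
    # Move the style to its new key, then fix up every event that used the old one.
    styles[new_name] = styles.pop(old_name)
    for event in events:
        if event["style"] == old_name:
            event["style"] = new_name


styles = {"Default": {"fontname": "Arial"}, "Sign": {"fontname": "Impact"}}
events = [{"style": "Default", "text": "Hi"}, {"style": "Sign", "text": "EXIT"}]
rename_style(styles, events, "Default", "Dialogue")
```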

SSAFile.import_styles(subs, overwrite=True)

Merge in styles from other SSAFile.

Parameters:
  • subs (SSAFile) – Subtitle file imported from.
  • overwrite (bool) – On name conflict, use style from the other file (default: True).
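
The effect of overwrite can be sketched with ordinary dicts standing in for SSAFile.styles. The string values below are placeholders for SSAStyle instances, and the function only illustrates the documented merge rule.

```python
def import_styles(target, source, overwrite=True):
    """Merge source styles into target (illustrative stand-in for the method)."""
    for name, style in source.items():
        # On a name conflict, the imported style wins only when overwrite is set.
        if overwrite or name not in target:
            target[name] = style


ours = {"Default": "ours-default"}
theirs = {"Default": "theirs-default", "Sign": "theirs-sign"}

kept = dict(ours)
import_styles(kept, theirs, overwrite=False)  # conflicting "Default" is kept

replaced = dict(ours)
import_styles(replaced, theirs)               # conflicting "Default" is replaced
```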

Misc methods

SSAFile.equals(other)

Equality of two SSAFiles.

Compares SSAFile.info, SSAFile.styles and SSAFile.events. Order of entries in OrderedDicts does not matter. “ScriptType” key in info is considered an implementation detail and thus ignored.

Useful mostly in unit tests. Differences are logged at DEBUG level.

SSAFile.sort()

Sort subtitles time-wise, in-place.

SSAEvent — one subtitle

class pysubs2.SSAEvent(**fields)

A SubStation Event, ie. one subtitle.

In SubStation, each subtitle consists of multiple “fields” like Start, End and Text. These are exposed as attributes (note that they are lowercase; see SSAEvent.FIELDS for a list). Additionally, there are some convenience properties like SSAEvent.plaintext or SSAEvent.duration.

This class defines an ordering with respect to (start, end) timestamps.

Tip

Use pysubs2.make_time() to get times in milliseconds.

Example:

>>> ev = SSAEvent(start=make_time(s=1), end=make_time(s=2.5), text="Hello World!")

FIELDS = frozenset({'layer', 'start', 'marginl', 'marginv', 'type', 'marginr', 'style', 'effect', 'text', 'end', 'name', 'marked'})

All fields in SSAEvent.

copy()

Return a copy of the SSAEvent.

duration

Subtitle duration in milliseconds (read/write property).

Writing to this property adjusts SSAEvent.end. Setting negative durations raises ValueError.
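
A read/write property of this kind can be sketched as follows. The Event class is a hypothetical stand-in that carries only start and end, to show how writing the duration moves the end time.

```python
class Event:
    """Hypothetical stand-in for SSAEvent with only start/end (milliseconds)."""

    def __init__(self, start=0, end=10000):
        self.start = start
        self.end = end

    @property
    def duration(self):
        # Reading: the duration is derived from the two timestamps.
        return self.end - self.start

    @duration.setter
    def duration(self, ms):
        # Writing: keep start fixed and move end; negative values are rejected.
        if ms < 0:
            raise ValueError("Subtitle duration cannot be negative")
        self.end = self.start + ms


ev = Event(start=1000, end=2500)
ev.duration = 4000
```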

effect = None

Line effect

end = None

Subtitle end time (in milliseconds)

equals(other)

Field-based equality for SSAEvents.

is_comment

When true, the subtitle is a comment, ie. not visible (read/write property).

Setting this property is equivalent to changing SSAEvent.type to "Dialogue" or "Comment".

layer = None

Layer number, 0 is the lowest layer (ASS only)

marginl = None

Left margin

marginr = None

Right margin

marginv = None

Vertical margin

marked = None

(SSA only)

name = None

Actor name

plaintext

Subtitle text as multi-line string with no tags (read/write property).

Writing to this property replaces SSAEvent.text with given plain text. Newlines are converted to \N tags.
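
The round trip between tagged text and plain text might look roughly like this. The regular expression for override blocks and both helper names are assumptions made for the sketch; the real property may handle more cases.

```python
import re

# Assumed pattern: an override block is anything between "{" and the next "}".
OVERRIDE_TAGS = re.compile(r"\{[^}]*\}")


def to_plaintext(text):
    """Drop override blocks and turn \\N line breaks into real newlines."""
    text = OVERRIDE_TAGS.sub("", text)
    return text.replace(r"\N", "\n")


def from_plaintext(plain):
    """Turn real newlines back into \\N tags."""
    return plain.replace("\n", r"\N")


tagged = r"{\i1}Hello{\i0}\Nworld"
plain = to_plaintext(tagged)
back = from_plaintext(plain)
```

Note that the formatting tags are lost on the way back, which matches the description above: writing plain text replaces SSAEvent.text entirely.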

shift(h=0, m=0, s=0, ms=0, frames=None, fps=None)

Shift start and end times.

See SSAFile.shift() for full description.

start = None

Subtitle start time (in milliseconds)

style = None

Style name

text = None

Text of subtitle (with SubStation override tags)

type = None

Line type (Dialogue/Comment)

SSAStyle — a subtitle style

class pysubs2.SSAStyle(**fields)

A SubStation Style.

In SubStation, each subtitle (SSAEvent) is associated with a style which defines its font, color, etc. Like a subtitle event, a style also consists of “fields”; see SSAStyle.FIELDS for a list (note the spelling, which is different from SubStation proper).

Subtitles and styles are connected via an SSAFile they belong to. SSAEvent.style is a string which is (or should be) a key in the SSAFile.styles dict. Note that style name is stored separately; a given SSAStyle instance has no particular name itself.

This class defines equality (equality of all fields).

FIELDS = frozenset({'marginv', 'encoding', 'primarycolor', 'strikeout', 'borderstyle', 'alignment', 'alphalevel', 'marginl', 'italic', 'fontname', 'underline', 'secondarycolor', 'backcolor', 'outline', 'scaley', 'bold', 'angle', 'outlinecolor', 'spacing', 'marginr', 'scalex', 'fontsize', 'tertiarycolor', 'shadow'})

All fields in SSAStyle.

alignment = None

Numpad-style alignment, eg. 7 is “top left” (that is, ASS alignment semantics)

alphalevel = None

Old, unused SSA-only field

angle = None

Rotation (ASS only)

backcolor = None

Back, ie. shadow color (pysubs2.Color instance)

bold = None

Bold

borderstyle = None

Border style

encoding = None

Charset

fontname = None

Font name

fontsize = None

Font size (in pixels)

italic = None

Italic

marginl = None

Left margin (in pixels)

marginr = None

Right margin (in pixels)

marginv = None

Vertical margin (in pixels)

outline = None

Outline width (in pixels)

outlinecolor = None

Outline color (pysubs2.Color instance)

primarycolor = None

Primary color (pysubs2.Color instance)

scalex = None

Horizontal scaling (ASS only)

scaley = None

Vertical scaling (ASS only)

secondarycolor = None

Secondary color (pysubs2.Color instance)

shadow = None

Shadow depth (in pixels)

spacing = None

Letter spacing (ASS only)

strikeout = None

Strikeout (ASS only)

tertiarycolor = None

Tertiary color (pysubs2.Color instance)

underline = None

Underline (ASS only)

pysubs2.time — time-related utilities

pysubs2.time.TIMESTAMP = re.compile('(\\d{1,2}):(\\d{2}):(\\d{2})[.,](\\d{2,3})')

Pattern that matches both SubStation and SubRip timestamps.

pysubs2.time.frames_to_ms(frames, fps)

Convert frame-based duration to milliseconds.

Parameters:
  • frames – Number of frames (should be int).
  • fps – Framerate (must be a positive number, eg. 23.976).
Returns:

Number of milliseconds (rounded to int).

Raises:

ValueError – fps was negative or zero.

pysubs2.time.make_time(h=0, m=0, s=0, ms=0, frames=None, fps=None)

Convert time to milliseconds.

See pysubs2.time.times_to_ms(). When both frames and fps are specified, pysubs2.time.frames_to_ms() is called instead.

Raises: ValueError – Invalid fps, or one of frames/fps is missing.

Example

>>> make_time(s=1.5)
1500
>>> make_time(frames=50, fps=25)
2000

pysubs2.time.ms_to_frames(ms, fps)

Convert milliseconds to number of frames.

Parameters:
  • ms – Number of milliseconds (may be int, float or other numeric class).
  • fps – Framerate (must be a positive number, eg. 23.976).
Returns:

Number of frames (int).

Raises:

ValueError – fps was negative or zero.
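
The two frame conversions are inverse scalings by the framerate. The sketch below reimplements them from the descriptions above; rounding to the nearest integer in ms_to_frames is an assumption, since the reference only says the result is an int.

```python
def frames_to_ms(frames, fps):
    """Frames -> milliseconds, rounded to int (illustrative reimplementation)."""
    if fps <= 0:
        raise ValueError("Framerate must be a positive number")
    return int(round(frames * 1000 / fps))


def ms_to_frames(ms, fps):
    """Milliseconds -> frames (rounding to nearest is an assumption)."""
    if fps <= 0:
        raise ValueError("Framerate must be a positive number")
    return int(round(ms / 1000 * fps))


two_seconds = frames_to_ms(50, 25)   # 50 frames at 25 fps
fifty_frames = ms_to_frames(2000, 25)
```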

pysubs2.time.ms_to_str(ms, fractions=False)

Pretty-print milliseconds as [-]H:MM:SS[.mmm].

Handles huge and/or negative times. Non-negative times with fractions=True are matched by pysubs2.time.TIMESTAMP.

Parameters:
  • ms – Number of milliseconds (int, float or other numeric class).
  • fractions – Whether to print up to millisecond precision.
Returns:

str
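
One way to produce the [-]H:MM:SS[.mmm] layout is shown below. This is a sketch written from the description above, not the library's code; in particular, rounding float input to whole milliseconds is an assumption.

```python
def ms_to_str(ms, fractions=False):
    """Format milliseconds as [-]H:MM:SS[.mmm] (illustrative reimplementation)."""
    sign = "-" if ms < 0 else ""
    ms = abs(int(round(ms)))
    h, rest = divmod(ms, 3_600_000)
    m, rest = divmod(rest, 60_000)
    s, ms = divmod(rest, 1000)
    if fractions:
        return f"{sign}{h}:{m:02d}:{s:02d}.{ms:03d}"
    return f"{sign}{h}:{m:02d}:{s:02d}"


short = ms_to_str(3_723_004)
precise = ms_to_str(3_723_004, fractions=True)
negative = ms_to_str(-1500, fractions=True)
```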

pysubs2.time.ms_to_times(ms)

Convert milliseconds to normalized tuple (h, m, s, ms).

Parameters: ms – Number of milliseconds (may be int, float or other numeric class). Should be non-negative.
Returns: Named tuple (h, m, s, ms) of ints. Invariants: ms in range(1000) and s in range(60) and m in range(60).

pysubs2.time.times_to_ms(h=0, m=0, s=0, ms=0)

Convert hours, minutes, seconds to milliseconds.

Arguments may be positive or negative, int or float, need not be normalized (s=120 is okay).

Returns:Number of milliseconds (rounded to int).
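
The relationship between times_to_ms and ms_to_times can be sketched with plain arithmetic. These are illustrative reimplementations based on the descriptions above; the tuple type name Times is hypothetical.

```python
from collections import namedtuple

Times = namedtuple("Times", ["h", "m", "s", "ms"])  # hypothetical type name


def times_to_ms(h=0, m=0, s=0, ms=0):
    """Combine possibly unnormalized, possibly fractional parts into milliseconds."""
    return int(round(ms + 1000 * s + 60_000 * m + 3_600_000 * h))


def ms_to_times(ms):
    """Split non-negative milliseconds into a normalized (h, m, s, ms) tuple."""
    ms = int(round(ms))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return Times(h, m, s, ms)


unnormalized = times_to_ms(s=120)       # two minutes given as seconds
parts = ms_to_times(3_723_004)
round_trip = times_to_ms(*parts)
```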

pysubs2.time.timestamp_to_ms(groups)

Convert groups from pysubs2.time.TIMESTAMP match to milliseconds.

Example

>>> timestamp_to_ms(TIMESTAMP.match("0:00:00.42").groups())
420
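
The example above shows that a two-digit fraction is read as hundredths of a second (SubStation style), while SubRip uses three digits. A self-contained sketch of that conversion, reusing the TIMESTAMP pattern quoted earlier, could look like this; the function body is an illustration, not the library's code.

```python
import re

TIMESTAMP = re.compile(r"(\d{1,2}):(\d{2}):(\d{2})[.,](\d{2,3})")


def timestamp_to_ms(groups):
    """Convert (h, m, s, fraction) strings from a TIMESTAMP match to milliseconds."""
    h, m, s, frac = groups
    # Pad the fraction to three digits so that "42" means 420 ms, not 42 ms.
    ms = int(frac.ljust(3, "0"))
    return ms + 1000 * int(s) + 60_000 * int(m) + 3_600_000 * int(h)


ass_ms = timestamp_to_ms(TIMESTAMP.match("0:00:00.42").groups())
srt_ms = timestamp_to_ms(TIMESTAMP.match("01:02:03,004").groups())
```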

pysubs2.exceptions — thrown exceptions

exception pysubs2.exceptions.FormatAutodetectionError

Subtitle format is ambiguous or unknown.

exception pysubs2.exceptions.Pysubs2Error

Base class for pysubs2 exceptions.

exception pysubs2.exceptions.UnknownFPSError

Framerate was not specified and couldn’t be inferred otherwise.

exception pysubs2.exceptions.UnknownFileExtensionError

File extension does not pertain to any known subtitle format.

exception pysubs2.exceptions.UnknownFormatIdentifierError

Unknown subtitle format identifier (ie. string like "srt").

pysubs2.formats — subtitle format implementations

Note

This submodule contains pysubs2 internals. It’s mostly of interest if you’re looking to implement a subtitle format not supported by the library. In that case, have a look at pysubs2.formats.FormatBase.

pysubs2.formats.FILE_EXTENSION_TO_FORMAT_IDENTIFIER = {'.ass': 'ass', '.json': 'json', '.srt': 'srt', '.ssa': 'ssa', '.sub': 'microdvd'}

Dict mapping file extensions to format identifiers.

pysubs2.formats.FORMAT_IDENTIFIER_TO_FORMAT_CLASS = {'ass': <class 'pysubs2.substation.SubstationFormat'>, 'json': <class 'pysubs2.jsonformat.JSONFormat'>, 'microdvd': <class 'pysubs2.microdvd.MicroDVDFormat'>, 'mpl2': <class 'pysubs2.mpl2.MPL2Format'>, 'srt': <class 'pysubs2.subrip.SubripFormat'>, 'ssa': <class 'pysubs2.substation.SubstationFormat'>}

Dict mapping format identifiers to implementations (FormatBase subclasses).

pysubs2.formats.autodetect_format(content)

Return format identifier for given fragment or raise FormatAutodetectionError.

pysubs2.formats.get_file_extension(format_)

Format identifier -> file extension

pysubs2.formats.get_format_class(format_)

Format identifier -> format class (ie. subclass of FormatBase)

pysubs2.formats.get_format_identifier(ext)

File extension -> format identifier
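
The extension lookup can be pictured as a plain dict access. The sketch below copies the mapping shown above into a local dict and uses a local stand-in exception class; identifier_for_path is a hypothetical helper, and lowercasing the extension is an assumption.

```python
import os.path

# Local copy of the mapping documented above.
FILE_EXTENSION_TO_FORMAT_IDENTIFIER = {
    ".ass": "ass",
    ".json": "json",
    ".srt": "srt",
    ".ssa": "ssa",
    ".sub": "microdvd",
}


class UnknownFileExtensionError(Exception):
    """Stand-in for pysubs2.exceptions.UnknownFileExtensionError."""


def identifier_for_path(path):
    """Map a file path to a format identifier via its extension."""
    ext = os.path.splitext(path)[1].lower()  # lowercasing is an assumption
    try:
        return FILE_EXTENSION_TO_FORMAT_IDENTIFIER[ext]
    except KeyError:
        raise UnknownFileExtensionError(ext) from None


srt_id = identifier_for_path("movie.srt")
sub_id = identifier_for_path("movie.sub")

try:
    identifier_for_path("subtitles.txt")
    txt_known = True
except UnknownFileExtensionError:
    txt_known = False  # ".txt" is not in the mapping
```

Because ".txt" has no entry in the mapping, saving MPL2 subtitles to a .txt file requires passing format_="mpl2" explicitly.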

class pysubs2.formats.FormatBase

Base class for subtitle format implementations.

How to implement a new subtitle format:

  1. Create a subclass of FormatBase and override the methods you want to support.
  2. Decide on a format identifier, like the "srt" or "microdvd" already used in the library.
  3. Add your identifier and class to pysubs2.formats.FORMAT_IDENTIFIER_TO_FORMAT_CLASS.
  4. (optional) Add your file extension and identifier to pysubs2.formats.FILE_EXTENSION_TO_FORMAT_IDENTIFIER.

After finishing these steps, you can call SSAFile.load() and SSAFile.save() with your format, including autodetection from content and file extension (if you provided these).
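
The steps above can be sketched without the library installed by using local stand-ins. In the block below, FormatBase and the two registry dicts are simplified local substitutes for the objects in pysubs2.formats, a plain list of dicts stands in for an SSAFile, and the "pipe" format, the PipeFormat class and the ".pipe" extension are all hypothetical.

```python
import io


class FormatBase:
    """Local stand-in mirroring the documented FormatBase method names."""

    @classmethod
    def guess_format(cls, text):
        return None

    @classmethod
    def from_file(cls, subs, fp, format_, **kwargs):
        raise NotImplementedError

    @classmethod
    def to_file(cls, subs, fp, format_, **kwargs):
        raise NotImplementedError


class PipeFormat(FormatBase):
    """Step 1: hypothetical format with one 'start_ms|end_ms|text' record per line."""

    @classmethod
    def guess_format(cls, text):
        lines = text.strip().splitlines()
        parts = lines[0].split("|") if lines else []
        ok = len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit()
        return "pipe" if ok else None  # step 2: the chosen identifier

    @classmethod
    def from_file(cls, subs, fp, format_, **kwargs):
        for line in fp:
            line = line.rstrip("\n")
            if line:
                start, end, text = line.split("|", 2)
                subs.append({"start": int(start), "end": int(end), "text": text})

    @classmethod
    def to_file(cls, subs, fp, format_, **kwargs):
        for ev in subs:
            fp.write(f"{ev['start']}|{ev['end']}|{ev['text']}\n")


# Steps 3 and 4: register the identifier and the extension (stand-in dicts).
FORMAT_IDENTIFIER_TO_FORMAT_CLASS = {"pipe": PipeFormat}
FILE_EXTENSION_TO_FORMAT_IDENTIFIER = {".pipe": "pipe"}

source = "0|2000|Hello\n2500|4000|World\n"
subs = []  # stand-in for an empty SSAFile
impl = FORMAT_IDENTIFIER_TO_FORMAT_CLASS[PipeFormat.guess_format(source)]
impl.from_file(subs, io.StringIO(source), "pipe")

out = io.StringIO()
impl.to_file(subs, out, "pipe")
```

With the real library you would subclass pysubs2.formats.FormatBase, add entries to the existing dicts in pysubs2.formats instead of creating new ones, and append SSAEvent instances to the SSAFile passed in as subs.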

classmethod from_file(subs, fp, format_, **kwargs)

Load subtitle file into an empty SSAFile.

If the parser autodetects framerate, set it as subs.fps.

Parameters:
  • subs (SSAFile) – An empty SSAFile.
  • fp (file object) – Text file object, the subtitle file.
  • format_ (str) – Format identifier. Used when one format class implements multiple formats (see SubstationFormat).
  • kwargs – Extra options, eg. fps.
Returns:

None

Raises:

pysubs2.exceptions.UnknownFPSError – Framerate was not provided and cannot be detected.

classmethod guess_format(text)

Return format identifier of recognized format, or None.

Parameters: text (str) – Content of subtitle file. When the file is long, this may be only its first few thousand characters.
Returns: format identifier (eg. "srt") or None (unknown format)

classmethod to_file(subs, fp, format_, **kwargs)

Write SSAFile into a file.

If you need framerate and it is not passed in keyword arguments, use subs.fps.

Parameters:
  • subs (SSAFile) – Subtitle file to write.
  • fp (file object) – Text file object used as output.
  • format_ (str) – Format identifier of desired output format. Used when one format class implements multiple formats (see SubstationFormat).
  • kwargs – Extra options, eg. fps.
Returns:

None

Raises:

pysubs2.exceptions.UnknownFPSError – Framerate was not provided and subs.fps is None.