API Reference
pysubs2
— the main module
- class pysubs2.Color(r: int, g: int, b: int, a: int = 0)
8-bit RGB color with alpha channel.
All values are ints from 0 to 255.
- pysubs2.load(path: str, encoding: str = 'utf-8', format_: str | None = None, fps: float | None = None, errors: str | None = 'surrogateescape', **kwargs: Any) SSAFile
Alias for SSAFile.load().
- pysubs2.load_from_whisper(result_or_segments: Dict[str, Any] | List[Dict[str, Any]]) SSAFile
Alias for pysubs2.whisper.load_from_whisper().
- pysubs2.make_time(h: int | float = 0, m: int | float = 0, s: int | float = 0, ms: int | float = 0, frames: int | None = None, fps: float | None = None) int
Alias for pysubs2.time.make_time().
- enum pysubs2.Alignment(value)
An integer enum specifying text alignment.
The integer values correspond to the Advanced SubStation Alpha definition (laid out like a numeric keypad). Note that the older SubStation Alpha (SSA) specification used a different numbering scheme.
- Member Type:
int
Valid values are as follows:
- BOTTOM_LEFT = <Alignment.BOTTOM_LEFT: 1>
- BOTTOM_CENTER = <Alignment.BOTTOM_CENTER: 2>
- BOTTOM_RIGHT = <Alignment.BOTTOM_RIGHT: 3>
- MIDDLE_LEFT = <Alignment.MIDDLE_LEFT: 4>
- MIDDLE_CENTER = <Alignment.MIDDLE_CENTER: 5>
- MIDDLE_RIGHT = <Alignment.MIDDLE_RIGHT: 6>
- TOP_LEFT = <Alignment.TOP_LEFT: 7>
- TOP_CENTER = <Alignment.TOP_CENTER: 8>
- TOP_RIGHT = <Alignment.TOP_RIGHT: 9>
SSAFile
— a subtitle file
- class pysubs2.SSAFile
Subtitle file in SubStation Alpha format.
This class has a list-like interface which exposes SSAFile.events, the list of subtitles in the file:
subs = SSAFile.load("subtitles.srt")
for line in subs:
    print(line.text)
subs.insert(0, SSAEvent(start=0, end=make_time(s=2.5), text="New first subtitle"))
del subs[0]
- aegisub_project: Dict[str, str]
Dict with Aegisub project, ie. [Aegisub Project Garbage].
- fonts_opaque: Dict[str, Any]
Dict with embedded fonts, ie. [Fonts].
- format: str | None
Format of source subtitle file, if applicable, eg. "srt".
- fps: float | None
Framerate used when reading the file, if applicable.
- info: Dict[str, str]
Dict with script metadata, ie. [Script Info].
Reading and writing subtitles
Using path to file
- classmethod SSAFile.load(path: str, encoding: str = 'utf-8', format_: str | None = None, fps: float | None = None, errors: str | None = 'surrogateescape', **kwargs: Any) SSAFile
Load subtitle file from given path.
This method is implemented in terms of SSAFile.from_file().
See also
Specific formats may implement additional loading options; please refer to the documentation of the implementation classes (eg. pysubs2.formats.subrip.SubripFormat.from_file()).
- Parameters:
path (str) – Path to subtitle file.
encoding (str) – Character encoding of input file. Defaults to UTF-8, you may need to change this.
errors (Optional[str]) – Error handling for character encoding of input file. Defaults to "surrogateescape". See the documentation of the builtin open() function for more.
Changed in version 2.0.0: The errors parameter was introduced to facilitate pass-through of subtitle files with unknown text encoding. Previous versions of the library behaved as if errors=None.
format_ (str) – Optional, forces use of a specific parser (eg. “srt”, “ass”). Otherwise, the format is detected automatically from file contents. This argument should rarely be needed.
fps (float) – Framerate for frame-based formats (MicroDVD), for other formats this argument is ignored. Framerate might be detected from the file, in which case you don’t need to specify it here (when given, this argument overrides autodetection).
kwargs – Extra options for the reader.
- Returns:
SSAFile
- Raises:
IOError –
UnicodeDecodeError –
Note
pysubs2 may autodetect subtitle format and/or framerate. These values are set as the SSAFile.format and SSAFile.fps attributes.
Example
>>> subs1 = pysubs2.load("subrip-subtitles.srt")
>>> subs2 = pysubs2.load("microdvd-subtitles.sub", fps=23.976)
>>> subs3 = pysubs2.load("subrip-subtitles-with-fancy-tags.srt", keep_unknown_html_tags=True)
- SSAFile.save(path: str, encoding: str = 'utf-8', format_: str | None = None, fps: float | None = None, errors: str | None = 'surrogateescape', **kwargs: Any) None
Save subtitle file to given path.
This method is implemented in terms of SSAFile.to_file().
See also
Specific formats may implement additional saving options; please refer to the documentation of the implementation classes (eg. pysubs2.formats.subrip.SubripFormat.to_file()).
- Parameters:
path (str) – Path to subtitle file.
encoding (str) – Character encoding of output file. Defaults to UTF-8, which should be fine for most purposes.
format_ (str) – Optional, specifies desired subtitle format (eg. “srt”, “ass”). Otherwise, the format is detected automatically from the file extension, so this argument is rarely needed.
fps (float) – Framerate for frame-based formats (MicroDVD); for other formats this argument is ignored. When omitted, the SSAFile.fps value is used (ie. the framerate used for loading the file, if any). When the SSAFile wasn’t loaded from MicroDVD, or if you wish to save it with a different framerate, use this argument. See also SSAFile.transform_framerate() for fixing bad frame-based to time-based conversions.
errors (Optional[str]) – Error handling for character encoding; defaults to "surrogateescape". See the documentation of the builtin open() function for more.
Changed in version 2.0.0: The errors parameter was introduced to facilitate pass-through of subtitle files with unknown text encoding. Previous versions of the library behaved as if errors=None.
kwargs – Extra options for the writer.
- Raises:
IOError –
UnicodeEncodeError –
Using string
- classmethod SSAFile.from_string(string: str, format_: str | None = None, fps: float | None = None, **kwargs: Any) SSAFile
Load subtitle file from string.
See SSAFile.load() for full description.
- Parameters:
string (str) – Subtitle file in a string. Note that the string must be Unicode (str, not bytes).
format_ (str) – Optional, forces use of a specific parser (eg. “srt”, “ass”). Otherwise, the format is detected automatically from file contents. This argument should rarely be needed.
fps (float) – Framerate for frame-based formats (MicroDVD), for other formats this argument is ignored. Framerate might be detected from the file, in which case you don’t need to specify it here (when given, this argument overrides autodetection).
- Returns:
SSAFile
Example
>>> text = '''
... 1
... 00:00:00,000 --> 00:00:05,000
... An example SubRip file.
... '''
>>> subs = SSAFile.from_string(text)
- SSAFile.to_string(format_: str, fps: float | None = None, **kwargs: Any) str
Get subtitle file as a string.
See SSAFile.save() for full description.
- Returns:
str
Using file object
- classmethod SSAFile.from_file(fp: TextIO, format_: str | None = None, fps: float | None = None, **kwargs: Any) SSAFile
Read subtitle file from file object.
See SSAFile.load() for full description.
Note
This is a low-level method. Usually, one of SSAFile.load() or SSAFile.from_string() is preferable.
- Parameters:
fp (file object) – A file object, ie. a TextIO instance. Note that the file must be opened in text mode (as opposed to binary).
format_ (str) – Optional, forces use of a specific parser (eg. “srt”, “ass”). Otherwise, the format is detected automatically from file contents. This argument should rarely be needed.
fps (float) – Framerate for frame-based formats (MicroDVD), for other formats this argument is ignored. Framerate might be detected from the file, in which case you don’t need to specify it here (when given, this argument overrides autodetection).
- Returns:
SSAFile
- SSAFile.to_file(fp: TextIO, format_: str, fps: float | None = None, **kwargs: Any) None
Write subtitle file to file object.
See SSAFile.save() for full description.
Note
This is a low-level method. Usually, one of SSAFile.save() or SSAFile.to_string() is preferable.
- Parameters:
fp (file object) – A file object, ie. a TextIO instance. Note that the file must be opened in text mode (as opposed to binary).
Retiming subtitles
- SSAFile.shift(h: int | float = 0, m: int | float = 0, s: int | float = 0, ms: int | float = 0, frames: int | None = None, fps: float | None = None) None
Shift all subtitles by constant time amount.
Shift may be time-based (the default) or frame-based. In the latter case, specify both frames and fps. h, m, s, ms will be ignored.
- Parameters:
h – Integer or float values, may be positive or negative (hours).
m – Integer or float values, may be positive or negative (minutes).
s – Integer or float values, may be positive or negative (seconds).
ms – Integer or float values, may be positive or negative (milliseconds).
frames (int) – When specified, must be an integer number of frames. May be positive or negative. fps must be also specified.
fps (float) – When specified, must be a positive number.
- Raises:
ValueError – Invalid fps or missing number of frames.
- SSAFile.transform_framerate(in_fps: float, out_fps: float) None
Rescale all timestamps by ratio of in_fps/out_fps.
Can be used to fix files converted from frame-based to time-based with wrongly assumed framerate.
- Parameters:
in_fps (float)
out_fps (float)
- Raises:
ValueError – Non-positive framerate given.
Working with styles
- SSAFile.rename_style(old_name: str, new_name: str) None
Rename a style, including references to it.
- Parameters:
old_name (str) – Style to be renamed.
new_name (str) – New name for the style (must be unused).
- Raises:
KeyError – No style named old_name.
ValueError – new_name is not a legal name (cannot use commas) or new_name is taken.
Misc methods
- SSAFile.remove_miscellaneous_events() None
Remove subtitles which appear to be non-essential (the --clean option in the CLI).
Currently, this removes events matching any of these criteria:
- SSA event type Comment
- SSA drawing tags
- Less than two characters of text
- Duplicated text with identical time interval (only the first event is kept)
- SSAFile.equals(other: SSAFile) bool
Equality of two SSAFiles.
Compares SSAFile.info, SSAFile.styles and SSAFile.events. Order of entries in OrderedDicts does not matter. The “ScriptType” key in info is considered an implementation detail and thus ignored.
Useful mostly in unit tests. Differences are logged at DEBUG level.
- SSAFile.sort() None
Sort subtitles time-wise, in-place.
SSAEvent
— one subtitle
- class pysubs2.SSAEvent(start: int = 0, end: int = 10000, text: str = '', marked: bool = False, layer: int = 0, style: str = 'Default', name: str = '', marginl: int = 0, marginr: int = 0, marginv: int = 0, effect: str = '', type: str = 'Dialogue')
A SubStation Event, ie. one subtitle.
In SubStation, each subtitle consists of multiple “fields” like Start, End and Text. These are exposed as attributes (note that they are lowercase; see SSAEvent.FIELDS for a list). Additionally, there are some convenience properties like SSAEvent.plaintext or SSAEvent.duration.
This class defines an ordering with respect to (start, end) timestamps.
Tip
Use pysubs2.make_time() to get times in milliseconds.
Example:
>>> ev = SSAEvent(start=make_time(s=1), end=make_time(s=2.5), text="Hello World!")
- property FIELDS: FrozenSet[str]
All fields in SSAEvent.
- property duration: int | float
Subtitle duration in milliseconds (read/write property).
Writing to this property adjusts SSAEvent.end. Setting a negative duration raises ValueError.
- effect: str = ''
Line effect
- end: int = 10000
Subtitle end time (in milliseconds)
- property is_comment: bool
When true, the subtitle is a comment, ie. not visible (read/write property).
Setting this property is equivalent to changing SSAEvent.type to "Dialogue" or "Comment".
- property is_drawing: bool
Returns True if line is SSA drawing tag (ie. not text)
- property is_text: bool
Returns False for SSA drawings and comment lines, True otherwise.
In general, events for which this is False should be ignored when working with non-SSA formats.
- layer: int = 0
Layer number, 0 is the lowest layer (ASS only)
- marginl: int = 0
Left margin
- marginr: int = 0
Right margin
- marginv: int = 0
Vertical margin
- marked: bool = False
(SSA only)
- name: str = ''
Actor name
- property plaintext: str
Subtitle text as multi-line string with no tags (read/write property).
Writing to this property replaces SSAEvent.text with given plain text. Newlines are converted to \N tags.
- shift(h: int | float = 0, m: int | float = 0, s: int | float = 0, ms: int | float = 0, frames: int | None = None, fps: float | None = None) None
Shift start and end times.
See SSAFile.shift() for full description.
- start: int = 0
Subtitle start time (in milliseconds)
- style: str = 'Default'
Style name
- text: str = ''
Text of subtitle (with SubStation override tags)
- type: str = 'Dialogue'
Line type (Dialogue/Comment)
SSAStyle
— a subtitle style
- class pysubs2.SSAStyle(fontname: str = 'Arial', fontsize: float = 20.0, primarycolor: ~pysubs2.common.Color = <factory>, secondarycolor: ~pysubs2.common.Color = <factory>, tertiarycolor: ~pysubs2.common.Color = <factory>, outlinecolor: ~pysubs2.common.Color = <factory>, backcolor: ~pysubs2.common.Color = <factory>, bold: bool = False, italic: bool = False, underline: bool = False, strikeout: bool = False, scalex: float = 100.0, scaley: float = 100.0, spacing: float = 0.0, angle: float = 0.0, borderstyle: int = 1, outline: float = 2.0, shadow: float = 2.0, alignment: ~pysubs2.common.Alignment = Alignment.BOTTOM_CENTER, marginl: int = 10, marginr: int = 10, marginv: int = 10, alphalevel: int = 0, encoding: int = 1, drawing: bool = False)
A SubStation Style.
In SubStation, each subtitle (SSAEvent) is associated with a style which defines its font, color, etc. Like a subtitle event, a style also consists of “fields”; see SSAStyle.FIELDS for a list (note the spelling, which is different from SubStation proper).
Subtitles and styles are connected via the SSAFile they belong to. SSAEvent.style is a string which is (or should be) a key in the SSAFile.styles dict. Note that the style name is stored separately; a given SSAStyle instance has no particular name itself.
This class defines equality (equality of all fields).
- property FIELDS: FrozenSet[str]
All fields in SSAStyle.
- alignment: Alignment = 2
Text alignment (pysubs2.Alignment instance); the underlying integer uses numpad-style alignment, eg. 7 is “top left” (that is, ASS alignment semantics). You can also use an int here, though it is discouraged.
- alphalevel: int = 0
Old, unused SSA-only field
- angle: float = 0.0
Rotation (ASS only)
- backcolor: Color
Back, ie. shadow color (pysubs2.Color instance)
- bold: bool = False
Bold
- borderstyle: int = 1
Border style (1=outline, 3=box)
- drawing: bool = False
Indicates that the text span is an SSA vector drawing, see pysubs2.formats.substation.parse_tags()
- encoding: int = 1
Charset
- fontname: str = 'Arial'
Font name
- fontsize: float = 20.0
Font size (in pixels)
- italic: bool = False
Italic
- marginl: int = 10
Left margin (in pixels)
- marginr: int = 10
Right margin (in pixels)
- marginv: int = 10
Vertical margin (in pixels)
- outline: float = 2.0
Outline width (in pixels)
- outlinecolor: Color
Outline color (pysubs2.Color instance)
- primarycolor: Color
Primary color (pysubs2.Color instance)
- scalex: float = 100.0
Horizontal scaling (ASS only)
- scaley: float = 100.0
Vertical scaling (ASS only)
- secondarycolor: Color
Secondary color (pysubs2.Color instance)
- shadow: float = 2.0
Shadow depth (in pixels)
- spacing: float = 0.0
Letter spacing (ASS only)
- strikeout: bool = False
Strikeout (ASS only)
- tertiarycolor: Color
Tertiary color (pysubs2.Color instance)
- underline: bool = False
Underline (ASS only)
pysubs2.time
— time-related utilities
- pysubs2.time.TIMESTAMP = re.compile('(\\d{1,2}):(\\d{1,2}):(\\d{1,2})[.,](\\d{1,3})')
Pattern that matches both SubStation and SubRip timestamps.
- pysubs2.time.TIMESTAMP_SHORT = re.compile('(\\d{1,2}):(\\d{2}):(\\d{2})')
Pattern that matches H:MM:SS or HH:MM:SS timestamps.
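To illustrate what the two patterns capture, the sketch below recompiles them with the standard re module (the regular expressions are copied from the definitions above; the sample timestamps are made up).

```python
import re

# Same regular expressions as pysubs2.time.TIMESTAMP and TIMESTAMP_SHORT.
TIMESTAMP = re.compile(r"(\d{1,2}):(\d{1,2}):(\d{1,2})[.,](\d{1,3})")
TIMESTAMP_SHORT = re.compile(r"(\d{1,2}):(\d{2}):(\d{2})")

# SubStation style (period) and SubRip style (comma) both match.
substation = TIMESTAMP.match("0:01:02.50").groups()
subrip = TIMESTAMP.match("00:01:02,500").groups()

# The short pattern has no fractional part.
short = TIMESTAMP_SHORT.match("1:02:03").groups()
```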
- pysubs2.time.frames_to_ms(frames: int, fps: float) int
Convert frame-based duration to milliseconds.
- Parameters:
frames – Number of frames (should be int).
fps – Framerate (must be a positive number, eg. 23.976).
- Returns:
Number of milliseconds (rounded to int).
- Raises:
ValueError – fps was negative or zero.
- pysubs2.time.make_time(h: int | float = 0, m: int | float = 0, s: int | float = 0, ms: int | float = 0, frames: int | None = None, fps: float | None = None) int
Convert time to milliseconds.
See pysubs2.time.times_to_ms(). When both frames and fps are specified, pysubs2.time.frames_to_ms() is called instead.
- Raises:
ValueError – Invalid fps, or one of frames/fps is missing.
Example
>>> make_time(s=1.5)
1500
>>> make_time(frames=50, fps=25)
2000
- pysubs2.time.ms_to_frames(ms: int | float, fps: float) int
Convert milliseconds to number of frames.
- Parameters:
ms – Number of milliseconds (may be int, float or other numeric class).
fps – Framerate (must be a positive number, eg. 23.976).
- Returns:
Number of frames (int).
- Raises:
ValueError – fps was negative or zero.
- pysubs2.time.ms_to_str(ms: int | float, fractions: bool = False) str
Prettyprint milliseconds to [-]H:MM:SS[.mmm]
Handles huge and/or negative times. Non-negative times with fractions=True are matched by pysubs2.time.TIMESTAMP.
- Parameters:
ms – Number of milliseconds (int, float or other numeric class).
fractions – Whether to print up to millisecond precision.
- Returns:
str
- pysubs2.time.ms_to_times(ms: int | float) Times
Convert milliseconds to normalized tuple (h, m, s, ms).
- Parameters:
ms – Number of milliseconds (may be int, float or other numeric class). Should be non-negative.
- Returns:
Named tuple (h, m, s, ms) of ints. Invariants:
ms in range(1000) and s in range(60) and m in range(60)
- pysubs2.time.times_to_ms(h: int | float = 0, m: int | float = 0, s: int | float = 0, ms: int | float = 0) int
Convert hours, minutes, seconds to milliseconds.
Arguments may be positive or negative, int or float, and need not be normalized (s=120 is okay).
- Returns:
Number of milliseconds (rounded to int).
- pysubs2.time.timestamp_to_ms(groups: Sequence[str]) int
Convert groups from pysubs2.time.TIMESTAMP or pysubs2.time.TIMESTAMP_SHORT match to milliseconds.
Example
>>> timestamp_to_ms(TIMESTAMP.match("0:00:00.42").groups())
420
>>> timestamp_to_ms(TIMESTAMP_SHORT.match("0:00:01").groups())
1000
pysubs2.exceptions
— thrown exceptions
- exception pysubs2.exceptions.FormatAutodetectionError(content: str, formats: List[str])
Bases:
Pysubs2Error
Subtitle format is ambiguous or unknown based on analysis of a file fragment.
This exception is raised by SSAFile.load() and related methods when the format_ parameter is not specified. The library tries to guess the input format from the first few kilobytes of the input file and raises this exception if the format cannot be uniquely determined.
- content
Analyzed subtitle file content
- Type:
str
- formats
Format identifiers for detected formats
- Type:
list[str]
- exception pysubs2.exceptions.Pysubs2Error
Bases:
Exception
Base class for pysubs2 exceptions.
- exception pysubs2.exceptions.UnknownFPSError
Bases:
Pysubs2Error
Framerate was not specified and couldn’t be inferred otherwise.
- exception pysubs2.exceptions.UnknownFileExtensionError(ext: str)
Bases:
Pysubs2Error
File extension does not pertain to any known subtitle format.
This exception is raised by SSAFile.save() when the format_ parameter is not specified. The library tries to guess the desired format from the output filename and raises this exception when it fails.
- ext
File extension
- Type:
str
- exception pysubs2.exceptions.UnknownFormatIdentifierError(format_: str)
Bases:
Pysubs2Error
Unknown subtitle format identifier (ie. string like "srt").
This exception is used when interpreting the format_ parameter fails, eg. in SSAFile.save().
- format_
Format identifier
- Type:
str
pysubs2.formats
— subtitle format implementations
Note
This subpackage contains pysubs2 internals. It’s mostly of interest if you’re looking to implement a subtitle format not supported by the library. In that case, have a look at pysubs2.formats.FormatBase.
- pysubs2.formats.substation.parse_tags(text: str, style: ~pysubs2.ssastyle.SSAStyle = <SSAStyle 20.0px 'Arial'>, styles: ~typing.Dict[str, ~pysubs2.ssastyle.SSAStyle] | None = None) List[Tuple[str, SSAStyle]]
Split text into fragments with computed SSAStyles.
Returns list of tuples (fragment, style), where fragment is a part of text between two brace-delimited override sequences, and style is the computed styling of the fragment, ie. the original style modified by all override sequences before the fragment.
Newline and non-breakable space overrides are left as-is.
Supported override tags:
i, b, u, s
r (with or without style name)
- pysubs2.formats.FILE_EXTENSION_TO_FORMAT_IDENTIFIER: Dict[str, str] = {'.ass': 'ass', '.json': 'json', '.srt': 'srt', '.ssa': 'ssa', '.sub': 'microdvd', '.txt': 'tmp', '.vtt': 'vtt'}
Dict mapping file extensions to format identifiers.
- pysubs2.formats.FORMAT_IDENTIFIER_TO_FORMAT_CLASS: Dict[str, Type[FormatBase]] = {'ass': <class 'pysubs2.formats.substation.SubstationFormat'>, 'json': <class 'pysubs2.formats.jsonformat.JSONFormat'>, 'microdvd': <class 'pysubs2.formats.microdvd.MicroDVDFormat'>, 'mpl2': <class 'pysubs2.formats.mpl2.MPL2Format'>, 'srt': <class 'pysubs2.formats.subrip.SubripFormat'>, 'ssa': <class 'pysubs2.formats.substation.SubstationFormat'>, 'tmp': <class 'pysubs2.formats.tmp.TmpFormat'>, 'vtt': <class 'pysubs2.formats.webvtt.WebVTTFormat'>}
Dict mapping format identifiers to implementations (FormatBase subclasses).
- pysubs2.formats.autodetect_format(content: str) str
Return format identifier for given fragment or raise FormatAutodetectionError.
- pysubs2.formats.get_file_extension(format_: str) str
Format identifier -> file extension
- pysubs2.formats.get_format_class(format_: str) Type[FormatBase]
Format identifier -> format class (ie. subclass of FormatBase)
- pysubs2.formats.get_format_identifier(ext: str) str
File extension -> format identifier
Subtitle format API
- class pysubs2.formats.FormatBase
Base class for subtitle format implementations.
How to implement a new subtitle format:
1. Create a subclass of FormatBase and override the methods you want to support.
2. Decide on a format identifier, like the "srt" or "microdvd" already used in the library.
3. Add your identifier and class to pysubs2.formats.FORMAT_IDENTIFIER_TO_FORMAT_CLASS.
4. (optional) Add your file extension and format identifier to pysubs2.formats.FILE_EXTENSION_TO_FORMAT_IDENTIFIER.
After finishing these steps, you can call SSAFile.load() and SSAFile.save() with your format, including autodetection from content and file extension (if you provided these).
- classmethod from_file(subs: SSAFile, fp: TextIO, format_: str, **kwargs: Any) None
Load subtitle file into an empty SSAFile.
If the parser autodetects framerate, set it as subs.fps.
- Parameters:
subs (SSAFile) – An empty SSAFile.
fp (file object) – Text file object, the subtitle file.
format_ (str) – Format identifier. Used when one format class implements multiple formats (see SubstationFormat).
kwargs – Extra options, eg. fps.
- Returns:
None
- Raises:
pysubs2.exceptions.UnknownFPSError – Framerate was not provided and cannot be detected.
- classmethod guess_format(text: str) str | None
Return format identifier of recognized format, or None.
- Parameters:
text (str) – Content of subtitle file. When the file is long, this may be only its first few thousand characters.
- Returns:
format identifier (eg. "srt") or None (unknown format)
- classmethod to_file(subs: SSAFile, fp: TextIO, format_: str, **kwargs: Any) None
Write SSAFile into a file.
If you need framerate and it is not passed in keyword arguments, use subs.fps.
- Parameters:
subs (SSAFile) – Subtitle file to write.
fp (file object) – Text file object used as output.
format_ (str) – Format identifier of desired output format. Used when one format class implements multiple formats (see SubstationFormat).
kwargs – Extra options, eg. fps.
- Returns:
None
- Raises:
pysubs2.exceptions.UnknownFPSError – Framerate was not provided and subs.fps is None.
Subtitle format implementations
Here you can find specific details regarding support of the individual subtitle formats.
Tip
Some formats support additional keyword parameters in their from_file() or to_file() methods. These are used to customize the parser/writer behaviour.
- class pysubs2.formats.substation.SubstationFormat
Bases:
FormatBase
SubStation Alpha (ASS, SSA) subtitle format implementation
- classmethod guess_format(text: str) str | None
- static ms_to_timestamp(requested_ms: int) str
Convert ms to ‘H:MM:SS.cc’
- class pysubs2.formats.subrip.SubripFormat
Bases:
FormatBase
SubRip Text (SRT) subtitle format implementation
- classmethod from_file(subs: SSAFile, fp: TextIO, format_: str, keep_html_tags: bool = False, keep_unknown_html_tags: bool = False, **kwargs: Any) None
See
pysubs2.formats.FormatBase.from_file()
Supported tags:
<i>
<u>
<s>
<b>
- Keyword Arguments:
keep_html_tags – If True, all HTML tags will be kept as-is instead of being converted to SubStation tags (eg. you will get <i>example</i> instead of {\i1}example{\i0}). Setting this to True overrides the keep_unknown_html_tags option.
keep_unknown_html_tags – If True, supported HTML tags will be converted to SubStation tags and any other HTML tags will be kept as-is (eg. you would get <blink>example {\i1}text{\i0}</blink>). If False, these other HTML tags will be stripped from output (in the previous example, you would get only example {\i1}text{\i0}).
- classmethod guess_format(text: str) str | None
- static ms_to_timestamp(ms: int) str
Convert ms to ‘HH:MM:SS,mmm’
- classmethod to_file(subs: SSAFile, fp: TextIO, format_: str, apply_styles: bool = True, keep_ssa_tags: bool = False, **kwargs: Any) None
See
pysubs2.formats.FormatBase.to_file()
Italic, underline and strikeout styling is supported.
- Keyword Arguments:
apply_styles – If False, do not write any styling (ignore line style and override tags).
keep_ssa_tags – If True, instead of trying to convert inline override tags to HTML (as supported by SRT), any inline tags will be passed to output (eg. {\an7}, which would be otherwise stripped; or {\b1} instead of <b>). Whitespace tags \h, \n and \N will always be converted to whitespace regardless of this option. In the current implementation, enabling this option disables processing of line styles: you will get inline tags, but if for example the line’s style is italic you will not get {\i1} at the beginning of the line. (Since this option is mostly useful for dealing with non-standard SRT files, ie. both input and output are SRT, which doesn’t use line styles, this shouldn’t be much of an issue in practice.)
- class pysubs2.formats.mpl2.MPL2Format
Bases:
FormatBase
MPL2 subtitle format implementation
- classmethod guess_format(text: str) str | None
- classmethod to_file(subs: SSAFile, fp: TextIO, format_: str, **kwargs: Any) None
See
pysubs2.formats.FormatBase.to_file()
No styling is supported at the moment.
- class pysubs2.formats.tmp.TmpFormat
Bases:
FormatBase
TMP subtitle format implementation
- classmethod guess_format(text: str) str | None
- static ms_to_timestamp(ms: int) str
Convert ms to ‘HH:MM:SS’
- classmethod to_file(subs: SSAFile, fp: TextIO, format_: str, apply_styles: bool = True, **kwargs: Any) None
See
pysubs2.formats.FormatBase.to_file()
Italic, underline and strikeout styling is supported.
- Keyword Arguments:
apply_styles – If False, do not write any styling.
- class pysubs2.formats.webvtt.WebVTTFormat
Bases:
SubripFormat
Web Video Text Tracks (WebVTT) subtitle format implementation
Currently, this shares implementation with pysubs2.formats.subrip.SubripFormat.
- classmethod guess_format(text: str) str | None
- static ms_to_timestamp(ms: int) str
Convert ms to ‘HH:MM:SS.mmm’
- class pysubs2.formats.microdvd.MicroDVDFormat
Bases:
FormatBase
MicroDVD subtitle format implementation
- classmethod from_file(subs: SSAFile, fp: TextIO, format_: str, fps: float | None = None, strict_fps_inference: bool = True, **kwargs: Any) None
See
pysubs2.formats.FormatBase.from_file()
- Keyword Arguments:
strict_fps_inference – If True (default), in the case when fps is not given, it will be read from the first subtitle text only if the start and end frame of this subtitle is {1}{1} (matches VLC Player behaviour), otherwise UnknownFPSError is raised. When strict_fps_inference is False, framerate will be read from the first subtitle text in this case regardless of start and end frame (which may give a bogus result if the first subtitle is not supposed to contain the framerate). Before the introduction of this option, the library behaved as if this option was False.
- classmethod guess_format(text: str) str | None
- classmethod to_file(subs: SSAFile, fp: TextIO, format_: str, fps: float | None = None, write_fps_declaration: bool = True, apply_styles: bool = True, **kwargs: Any) None
See
pysubs2.formats.FormatBase.to_file()
The only supported styling is marking whole lines italic.
- Keyword Arguments:
write_fps_declaration – If True, create a zero-duration first subtitle {1}{1} which will contain the fps.
apply_styles – If False, do not write any styling.
- class pysubs2.formats.jsonformat.JSONFormat
Bases:
FormatBase
Implementation of JSON subtitle pseudo-format (serialized pysubs2 internal representation)
This is essentially SubStation Alpha as JSON.
- classmethod guess_format(text: str) str | None
Misc functions
- pysubs2.whisper.load_from_whisper(result_or_segments: Dict[str, Any] | List[Dict[str, Any]]) SSAFile
Load subtitle file from OpenAI Whisper transcript
Example
>>> import whisper
>>> import pysubs2
>>> model = whisper.load_model("base")
>>> result = model.transcribe("audio.mp3")
>>> subs = pysubs2.load_from_whisper(result)
>>> subs.save("audio.ass")
See also
- Parameters:
result_or_segments – Either a dict with a "segments" key that holds a list of segment dicts, or the segment list-of-dicts. Each segment is a dict with keys "start", "end" (float, timestamps in seconds) and "text" (str with caption text).
- Returns: