API Reference
pysubs2
— the main module
- class pysubs2.Alignment(value)
An integer enum specifying text alignment.
The integer values correspond to the Advanced SubStation Alpha (ASS) definition, which follows the numeric keypad layout. Note that the older SubStation Alpha (SSA) specification used a different numbering scheme.
- classmethod from_ssa_alignment(alignment: int) pysubs2.common.Alignment
Convert SSA alignment to ASS alignment
- to_ssa_alignment() int
Convert ASS alignment to SSA alignment
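The relation between the two numbering schemes can be illustrated with a small lookup table. This is a standalone sketch using only the standard library; the names NumpadAlignment, SSA_ALIGNMENT, ass_to_ssa and ssa_to_ass are illustrative assumptions, not pysubs2's own code. It assumes the usual SSA convention (1 to 3 for the bottom row, plus 4 for the top row, plus 8 for the middle row).

```python
from enum import IntEnum


class NumpadAlignment(IntEnum):
    """ASS alignment values laid out like a numeric keypad (illustrative stand-in)."""
    BOTTOM_LEFT = 1
    BOTTOM_CENTER = 2
    BOTTOM_RIGHT = 3
    MIDDLE_LEFT = 4
    MIDDLE_CENTER = 5
    MIDDLE_RIGHT = 6
    TOP_LEFT = 7
    TOP_CENTER = 8
    TOP_RIGHT = 9


# SSA value for each ASS value 1..9 (index 0 corresponds to ASS alignment 1).
SSA_ALIGNMENT = (1, 2, 3, 9, 10, 11, 5, 6, 7)


def ass_to_ssa(alignment: NumpadAlignment) -> int:
    """Look up the SSA number for an ASS (numpad) alignment."""
    return SSA_ALIGNMENT[alignment - 1]


def ssa_to_ass(ssa_value: int) -> NumpadAlignment:
    """Inverse lookup; tuple.index raises ValueError for invalid SSA values."""
    return NumpadAlignment(SSA_ALIGNMENT.index(ssa_value) + 1)
```

With the real library, the equivalent calls would be Alignment.from_ssa_alignment() and Alignment.to_ssa_alignment() as documented above.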
- class pysubs2.Color(r: int, g: int, b: int, a: int = 0)
8-bit RGB color with alpha channel.
All values are ints from 0 to 255.
- pysubs2.load(path: str, encoding: str = 'utf-8', format_: Optional[str] = None, fps: Optional[float] = None, **kwargs) pysubs2.ssafile.SSAFile
Alias for SSAFile.load().
- pysubs2.load_from_whisper(result_or_segments: Union[Dict[str, Any], List[Dict[str, Any]]]) pysubs2.ssafile.SSAFile
Alias for pysubs2.whisper.load_from_whisper().
- pysubs2.make_time(h: Union[int, float] = 0, m: Union[int, float] = 0, s: Union[int, float] = 0, ms: Union[int, float] = 0, frames: Optional[int] = None, fps: Optional[float] = None)
Alias for pysubs2.time.make_time().
SSAFile
— a subtitle file
- class pysubs2.SSAFile
Subtitle file in SubStation Alpha format.
This class has a list-like interface which exposes SSAFile.events, the list of subtitles in the file:
    subs = SSAFile.load("subtitles.srt")
    for line in subs:
        print(line.text)
    subs.insert(0, SSAEvent(start=0, end=make_time(s=2.5), text="New first subtitle"))
    del subs[0]
- aegisub_project: Dict[str, str]
Dict with Aegisub project, ie. [Aegisub Project Garbage].
- events: List[pysubs2.ssaevent.SSAEvent]
List of SSAEvent instances, ie. individual subtitles.
- fonts_opaque: Dict[str, Any]
Dict with embedded fonts, ie. [Fonts].
- format: Optional[str]
Format of source subtitle file, if applicable, eg. "srt".
- fps: Optional[float]
Framerate used when reading the file, if applicable.
- info: Dict[str, str]
Dict with script metadata, ie. [Script Info].
- styles: Dict[str, pysubs2.ssastyle.SSAStyle]
Dict of SSAStyle instances.
Reading and writing subtitles
Using path to file
- classmethod SSAFile.load(path: str, encoding: str = 'utf-8', format_: Optional[str] = None, fps: Optional[float] = None, **kwargs) pysubs2.ssafile.SSAFile
Load subtitle file from given path.
This method is implemented in terms of SSAFile.from_file().
See also
Specific formats may implement additional loading options, please refer to documentation of the implementation classes (eg. pysubs2.subrip.SubripFormat.from_file()).
- Parameters
path (str) – Path to subtitle file.
encoding (str) – Character encoding of input file. Defaults to UTF-8; you may need to change this.
format_ (str) – Optional, forces use of specific parser (eg. “srt”, “ass”). Otherwise, format is detected automatically from file contents. This argument should be rarely needed.
fps (float) – Framerate for frame-based formats (MicroDVD); for other formats this argument is ignored. Framerate might be detected from the file, in which case you don’t need to specify it here (when given, this argument overrides autodetection).
kwargs – Extra options for the reader.
- Returns
SSAFile
- Raises
IOError –
UnicodeDecodeError –
Note
pysubs2 may autodetect subtitle format and/or framerate. These values are set as the SSAFile.format and SSAFile.fps attributes.
Example
>>> subs1 = pysubs2.load("subrip-subtitles.srt")
>>> subs2 = pysubs2.load("microdvd-subtitles.sub", fps=23.976)
>>> subs3 = pysubs2.load("subrip-subtitles-with-fancy-tags.srt", keep_unknown_html_tags=True)
- SSAFile.save(path: str, encoding: str = 'utf-8', format_: Optional[str] = None, fps: Optional[float] = None, **kwargs)
Save subtitle file to given path.
This method is implemented in terms of SSAFile.to_file().
See also
Specific formats may implement additional saving options, please refer to documentation of the implementation classes (eg. pysubs2.subrip.SubripFormat.to_file()).
- Parameters
path (str) – Path to subtitle file.
encoding (str) – Character encoding of output file. Defaults to UTF-8, which should be fine for most purposes.
format_ (str) – Optional, specifies desired subtitle format (eg. “srt”, “ass”). Otherwise, format is detected automatically from file extension. Thus, this argument is rarely needed.
fps (float) – Framerate for frame-based formats (MicroDVD); for other formats this argument is ignored. When omitted, the SSAFile.fps value is used (ie. the framerate used for loading the file, if any). When the SSAFile wasn’t loaded from MicroDVD, or if you wish to save it with a different framerate, use this argument. See also SSAFile.transform_framerate() for fixing bad frame-based to time-based conversions.
kwargs – Extra options for the writer.
- Raises
IOError –
UnicodeEncodeError –
Using string
- classmethod SSAFile.from_string(string: str, format_: Optional[str] = None, fps: Optional[float] = None, **kwargs) pysubs2.ssafile.SSAFile
Load subtitle file from string.
See SSAFile.load() for full description.
- Parameters
string (str) – Subtitle file in a string. Note that the string must be Unicode (in Python 2).
- Returns
SSAFile
Example
>>> text = '''
... 1
... 00:00:00,000 --> 00:00:05,000
... An example SubRip file.
... '''
>>> subs = SSAFile.from_string(text)
- SSAFile.to_string(format_: str, fps: Optional[float] = None, **kwargs) str
Get subtitle file as a string.
See SSAFile.save() for full description.
- Returns
str
Using file object
- classmethod SSAFile.from_file(fp: io.TextIOBase, format_: Optional[str] = None, fps: Optional[float] = None, **kwargs) pysubs2.ssafile.SSAFile
Read subtitle file from file object.
See SSAFile.load() for full description.
Note
This is a low-level method. Usually, one of SSAFile.load() or SSAFile.from_string() is preferable.
- Parameters
fp (file object) – A file object, ie. io.TextIOBase instance. Note that the file must be opened in text mode (as opposed to binary).
- Returns
SSAFile
- SSAFile.to_file(fp: io.TextIOBase, format_: str, fps: Optional[float] = None, **kwargs)
Write subtitle file to file object.
See SSAFile.save() for full description.
Note
This is a low-level method. Usually, one of SSAFile.save() or SSAFile.to_string() is preferable.
- Parameters
fp (file object) – A file object, ie. io.TextIOBase instance. Note that the file must be opened in text mode (as opposed to binary).
Retiming subtitles
- SSAFile.shift(h: Union[int, float] = 0, m: Union[int, float] = 0, s: Union[int, float] = 0, ms: Union[int, float] = 0, frames: Optional[int] = None, fps: Optional[float] = None)
Shift all subtitles by constant time amount.
Shift may be time-based (the default) or frame-based. In the latter case, specify both frames and fps; h, m, s and ms are then ignored.
- Parameters
h – Integer or float values, may be positive or negative.
m – Integer or float values, may be positive or negative.
s – Integer or float values, may be positive or negative.
ms – Integer or float values, may be positive or negative.
frames (int) – When specified, must be an integer number of frames. May be positive or negative. fps must also be specified.
fps (float) – When specified, must be a positive number.
- Raises
ValueError – Invalid fps or missing number of frames.
- SSAFile.transform_framerate(in_fps: float, out_fps: float)
Rescale all timestamps by ratio of in_fps/out_fps.
Can be used to fix files converted from frame-based to time-based with wrongly assumed framerate.
- Parameters
in_fps (float) –
out_fps (float) –
- Raises
ValueError – Non-positive framerate given.
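The rescaling described above amounts to multiplying every timestamp by in_fps / out_fps. The following is a minimal standalone sketch of that arithmetic on plain (start, end) millisecond pairs; the function name rescale_timestamps and the tuple representation are assumptions for illustration, not the library's implementation.

```python
from typing import List, Tuple


def rescale_timestamps(
    events: List[Tuple[int, int]], in_fps: float, out_fps: float
) -> List[Tuple[int, int]]:
    """Scale (start, end) millisecond pairs by the ratio in_fps / out_fps."""
    if in_fps <= 0 or out_fps <= 0:
        raise ValueError("Framerates must be positive numbers")
    ratio = in_fps / out_fps
    # Round to the nearest millisecond (an assumption of this sketch).
    return [(round(start * ratio), round(end * ratio)) for start, end in events]


# With in_fps=25 and out_fps=23.976, every timestamp grows by the factor 25 / 23.976.
print(rescale_timestamps([(0, 1000), (60000, 62000)], in_fps=25, out_fps=23.976))
```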
Working with styles
- SSAFile.rename_style(old_name: str, new_name: str)
Rename a style, including references to it.
- Parameters
old_name (str) – Style to be renamed.
new_name (str) – New name for the style (must be unused).
- Raises
KeyError – No style named old_name.
ValueError – new_name is not a legal name (cannot use commas) or new_name is taken.
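The behaviour described above (rename the key, update references, reject missing or conflicting names) can be sketched with plain dicts. This is an illustrative stand-in under the assumption that styles is a name-to-style dict and each event carries a "style" key; it is not the library's code.

```python
from typing import Dict, List


def rename_style(
    styles: Dict[str, dict], events: List[dict], old_name: str, new_name: str
) -> None:
    """Rename a style key and update every event that refers to it (in place)."""
    if old_name not in styles:
        raise KeyError(f"No style named {old_name!r}")
    if new_name in styles:
        raise ValueError(f"Style name {new_name!r} is already taken")
    if "," in new_name:
        raise ValueError("Style names cannot contain commas")

    styles[new_name] = styles.pop(old_name)
    for event in events:
        if event["style"] == old_name:
            event["style"] = new_name


styles = {"Default": {"fontname": "Arial"}}
events = [{"style": "Default", "text": "Hello"}]
rename_style(styles, events, "Default", "Main")
```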
- SSAFile.import_styles(subs: pysubs2.ssafile.SSAFile, overwrite: bool = True)
Merge in styles from other SSAFile.
- Parameters
subs (SSAFile) – Subtitle file imported from.
overwrite (bool) – On name conflict, use style from the other file (default: True).
Misc methods
- SSAFile.remove_miscellaneous_events()
Remove subtitles which appear to be non-essential (the --clean option in the CLI).
Currently, this removes events matching any of these criteria:
- SSA event type Comment
- SSA drawing tags
- Less than two characters of text
- Duplicated text with identical time interval (only the first event is kept)
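The criteria above can be read as a simple filter. The sketch below uses a hypothetical Event dataclass rather than SSAEvent, and it assumes that "less than two characters" is measured on the text with surrounding whitespace stripped; both are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    """Minimal stand-in for a subtitle event (hypothetical, not SSAEvent)."""
    start: int
    end: int
    text: str
    is_comment: bool = False
    is_drawing: bool = False


def remove_miscellaneous(events: List[Event]) -> List[Event]:
    """Return only the events that match none of the removal criteria."""
    kept = []
    seen = set()  # (start, end, text) triples that were already kept
    for ev in events:
        if ev.is_comment or ev.is_drawing:
            continue
        if len(ev.text.strip()) < 2:
            continue
        key = (ev.start, ev.end, ev.text)
        if key in seen:
            continue  # duplicate text over an identical interval
        seen.add(key)
        kept.append(ev)
    return kept
```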
- SSAFile.equals(other: pysubs2.ssafile.SSAFile)
Equality of two SSAFiles.
Compares SSAFile.info, SSAFile.styles and SSAFile.events. Order of entries in OrderedDicts does not matter. “ScriptType” key in info is considered an implementation detail and thus ignored.
Useful mostly in unit tests. Differences are logged at DEBUG level.
- SSAFile.sort()
Sort subtitles time-wise, in-place.
SSAEvent
— one subtitle
- class pysubs2.SSAEvent(start: int = 0, end: int = 10000, text: str = '', marked: bool = False, layer: int = 0, style: str = 'Default', name: str = '', marginl: int = 0, marginr: int = 0, marginv: int = 0, effect: str = '', type: str = 'Dialogue')
A SubStation Event, ie. one subtitle.
In SubStation, each subtitle consists of multiple “fields” like Start, End and Text. These are exposed as attributes (note that they are lowercase; see SSAEvent.FIELDS for a list). Additionally, there are some convenience properties like SSAEvent.plaintext or SSAEvent.duration.
This class defines an ordering with respect to (start, end) timestamps.
Tip
Use pysubs2.make_time() to get times in milliseconds.
Example:
>>> ev = SSAEvent(start=make_time(s=1), end=make_time(s=2.5), text="Hello World!")
- property FIELDS
All fields in SSAEvent.
- copy() pysubs2.ssaevent.SSAEvent
Return a copy of the SSAEvent.
- property duration: Union[int, float]
Subtitle duration in milliseconds (read/write property).
Writing to this property adjusts SSAEvent.end. Setting negative durations raises ValueError.
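A read/write duration of this kind can be modelled as a property over start and end. The TimedEvent class below is a hypothetical stand-in with only the timing fields; it sketches the documented behaviour and is not taken from the library.

```python
from dataclasses import dataclass


@dataclass
class TimedEvent:
    """Hypothetical stand-in with only the timing fields of an event."""
    start: int = 0
    end: int = 10000

    @property
    def duration(self) -> int:
        # Duration is derived, so it never gets out of sync with start/end.
        return self.end - self.start

    @duration.setter
    def duration(self, ms: int) -> None:
        if ms < 0:
            raise ValueError("Subtitle duration cannot be negative")
        self.end = self.start + ms


ev = TimedEvent(start=1000, end=2500)
ev.duration = 4000  # moves `end`, leaves `start` untouched
```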
- effect: str = ''
Line effect
- end: int = 10000
Subtitle end time (in milliseconds)
- equals(other: pysubs2.ssaevent.SSAEvent) bool
Field-based equality for SSAEvents.
- property is_comment: bool
When true, the subtitle is a comment, ie. not visible (read/write property).
Setting this property is equivalent to changing SSAEvent.type to "Dialogue" or "Comment".
- property is_drawing: bool
Returns True if line is SSA drawing tag (ie. not text)
- layer: int = 0
Layer number, 0 is the lowest layer (ASS only)
- marginl: int = 0
Left margin
- marginr: int = 0
Right margin
- marginv: int = 0
Vertical margin
- marked: bool = False
(SSA only)
- name: str = ''
Actor name
- property plaintext: str
Subtitle text as multi-line string with no tags (read/write property).
Writing to this property replaces SSAEvent.text with given plain text. Newlines are converted to \N tags.
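The round trip between tagged text and plain text can be sketched with a regular expression. The helper names and the OVERRIDE_SEQUENCE pattern are assumptions of this sketch, as is the treatment of \h and \n as whitespace when reading; only the newline-to-\N conversion on writing is stated above.

```python
import re

# Brace-delimited override sequences such as {\i1} or {\an7}.
OVERRIDE_SEQUENCE = re.compile(r"{[^}]*}")


def to_plaintext(text: str) -> str:
    """Strip override sequences and turn whitespace tags into real whitespace."""
    text = OVERRIDE_SEQUENCE.sub("", text)
    text = text.replace(r"\h", " ")   # non-breaking space tag
    text = text.replace(r"\n", "\n")  # soft line break tag
    text = text.replace(r"\N", "\n")  # hard line break tag
    return text


def from_plaintext(plain: str) -> str:
    """Encode real newlines as \\N tags."""
    return plain.replace("\n", r"\N")
```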
- shift(h: Union[int, float] = 0, m: Union[int, float] = 0, s: Union[int, float] = 0, ms: Union[int, float] = 0, frames: Optional[int] = None, fps: Optional[float] = None)
Shift start and end times.
See SSAFile.shift() for full description.
- start: int = 0
Subtitle start time (in milliseconds)
- style: str = 'Default'
Style name
- text: str = ''
Text of subtitle (with SubStation override tags)
- type: str = 'Dialogue'
Line type (Dialogue/Comment)
SSAStyle
— a subtitle style
- class pysubs2.SSAStyle(fontname: str = 'Arial', fontsize: float = 20.0, primarycolor: pysubs2.common.Color = <factory>, secondarycolor: pysubs2.common.Color = <factory>, tertiarycolor: pysubs2.common.Color = <factory>, outlinecolor: pysubs2.common.Color = <factory>, backcolor: pysubs2.common.Color = <factory>, bold: bool = False, italic: bool = False, underline: bool = False, strikeout: bool = False, scalex: float = 100.0, scaley: float = 100.0, spacing: float = 0.0, angle: float = 0.0, borderstyle: int = 1, outline: float = 2.0, shadow: float = 2.0, alignment: pysubs2.common.Alignment = Alignment.BOTTOM_CENTER, marginl: int = 10, marginr: int = 10, marginv: int = 10, alphalevel: int = 0, encoding: int = 1, drawing: bool = False)
A SubStation Style.
In SubStation, each subtitle (SSAEvent) is associated with a style which defines its font, color, etc. Like a subtitle event, a style also consists of “fields”; see SSAStyle.FIELDS for a list (note the spelling, which is different from SubStation proper).
Subtitles and styles are connected via an SSAFile they belong to. SSAEvent.style is a string which is (or should be) a key in the SSAFile.styles dict. Note that style name is stored separately; a given SSAStyle instance has no particular name itself.
This class defines equality (equality of all fields).
- property FIELDS
All fields in SSAStyle.
- alignment: pysubs2.common.Alignment = 2
Text alignment (pysubs2.Alignment instance); the underlying integer uses numpad-style alignment, eg. 7 is “top left” (that is, ASS alignment semantics). You can also use int here, though it is discouraged.
- alphalevel: int = 0
Old, unused SSA-only field
- angle: float = 0.0
Rotation (ASS only)
- backcolor: pysubs2.common.Color
Back, ie. shadow color (pysubs2.Color instance)
- bold: bool = False
Bold
- borderstyle: int = 1
Border style (1=outline, 3=box)
- drawing: bool = False
Indicates that the text span is an SSA vector drawing; see pysubs2.substation.parse_tags()
- encoding: int = 1
Charset
- fontname: str = 'Arial'
Font name
- fontsize: float = 20.0
Font size (in pixels)
- italic: bool = False
Italic
- marginl: int = 10
Left margin (in pixels)
- marginr: int = 10
Right margin (in pixels)
- marginv: int = 10
Vertical margin (in pixels)
- outline: float = 2.0
Outline width (in pixels)
- outlinecolor: pysubs2.common.Color
Outline color (pysubs2.Color instance)
- primarycolor: pysubs2.common.Color
Primary color (pysubs2.Color instance)
- scalex: float = 100.0
Horizontal scaling (ASS only)
- scaley: float = 100.0
Vertical scaling (ASS only)
- secondarycolor: pysubs2.common.Color
Secondary color (pysubs2.Color instance)
- shadow: float = 2.0
Shadow depth (in pixels)
- spacing: float = 0.0
Letter spacing (ASS only)
- strikeout: bool = False
Strikeout (ASS only)
- tertiarycolor: pysubs2.common.Color
Tertiary color (pysubs2.Color instance)
- underline: bool = False
Underline (ASS only)
pysubs2.time
— time-related utilities
- pysubs2.time.TIMESTAMP = re.compile('(\\d{1,2}):(\\d{1,2}):(\\d{1,2})[.,](\\d{1,3})')
Pattern that matches both SubStation and SubRip timestamps.
- pysubs2.time.TIMESTAMP_SHORT = re.compile('(\\d{1,2}):(\\d{2}):(\\d{2})')
Pattern that matches H:MM:SS or HH:MM:SS timestamps.
- pysubs2.time.frames_to_ms(frames: int, fps: float) int
Convert frame-based duration to milliseconds.
- Parameters
frames – Number of frames (should be int).
fps – Framerate (must be a positive number, eg. 23.976).
- Returns
Number of milliseconds (rounded to int).
- Raises
ValueError – fps was negative or zero.
- pysubs2.time.make_time(h: Union[int, float] = 0, m: Union[int, float] = 0, s: Union[int, float] = 0, ms: Union[int, float] = 0, frames: Optional[int] = None, fps: Optional[float] = None)
Convert time to milliseconds.
See pysubs2.time.times_to_ms(). When both frames and fps are specified, pysubs2.time.frames_to_ms() is called instead.
- Raises
ValueError – Invalid fps, or one of frames/fps is missing.
Example
>>> make_time(s=1.5)
1500
>>> make_time(frames=50, fps=25)
2000
- pysubs2.time.ms_to_frames(ms: Union[int, float], fps: float) int
Convert milliseconds to number of frames.
- Parameters
ms – Number of milliseconds (may be int, float or other numeric class).
fps – Framerate (must be a positive number, eg. 23.976).
- Returns
Number of frames (int).
- Raises
ValueError – fps was negative or zero.
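The two frame conversions are each other's inverse up to rounding. The following standalone sketch shows the arithmetic; rounding to the nearest integer in ms_to_frames is an assumption of this sketch, since the description above only states that an int is returned.

```python
def frames_to_ms(frames: int, fps: float) -> int:
    """Convert a frame count to milliseconds, rounded to the nearest int."""
    if fps <= 0:
        raise ValueError(f"Framerate must be a positive number ({fps})")
    return round(frames * 1000 / fps)


def ms_to_frames(ms: float, fps: float) -> int:
    """Convert milliseconds to a frame count (nearest-integer rounding assumed)."""
    if fps <= 0:
        raise ValueError(f"Framerate must be a positive number ({fps})")
    return round(ms / 1000 * fps)


print(frames_to_ms(50, 25))    # 50 frames at 25 fps last two seconds
print(ms_to_frames(2000, 25))  # and two seconds at 25 fps span 50 frames
```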
- pysubs2.time.ms_to_str(ms: Union[int, float], fractions: bool = False) str
Prettyprint milliseconds to [-]H:MM:SS[.mmm]
Handles huge and/or negative times. Non-negative times with fractions=True are matched by pysubs2.time.TIMESTAMP.
- Parameters
ms – Number of milliseconds (int, float or other numeric class).
fractions – Whether to print up to millisecond precision.
- Returns
str
- pysubs2.time.ms_to_times(ms: Union[int, float]) Tuple[int, int, int, int]
Convert milliseconds to normalized tuple (h, m, s, ms).
- Parameters
ms – Number of milliseconds (may be int, float or other numeric class). Should be non-negative.
- Returns
Named tuple (h, m, s, ms) of ints. Invariants:
ms in range(1000) and s in range(60) and m in range(60)
- pysubs2.time.times_to_ms(h: Union[int, float] = 0, m: Union[int, float] = 0, s: Union[int, float] = 0, ms: Union[int, float] = 0) int
Convert hours, minutes, seconds to milliseconds.
Arguments may be positive or negative, int or float, need not be normalized (s=120 is okay).
- Returns
Number of milliseconds (rounded to int).
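The conversions between milliseconds, (h, m, s, ms) tuples and printed timestamps fit together as shown in this standalone sketch. The Times named tuple and the function bodies are illustrative assumptions that follow the descriptions above; they are not copied from the library.

```python
from typing import NamedTuple


class Times(NamedTuple):
    h: int
    m: int
    s: int
    ms: int


def times_to_ms(h: float = 0, m: float = 0, s: float = 0, ms: float = 0) -> int:
    """Sum the components; they need not be normalized (s=120 is fine)."""
    return round(ms + s * 1000 + m * 60_000 + h * 3_600_000)


def ms_to_times(ms: float) -> Times:
    """Split non-negative milliseconds into a normalized (h, m, s, ms) tuple."""
    ms = round(ms)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return Times(h, m, s, ms)


def ms_to_str(ms: float, fractions: bool = False) -> str:
    """Format as [-]H:MM:SS[.mmm]; the sign is handled separately."""
    sign = "-" if ms < 0 else ""
    h, m, s, frac = ms_to_times(abs(ms))
    base = f"{sign}{h}:{m:02d}:{s:02d}"
    return f"{base}.{frac:03d}" if fractions else base
```

Handling the sign outside ms_to_times keeps the tuple invariants (ms below 1000, s and m below 60) valid for negative inputs as well.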
- pysubs2.time.timestamp_to_ms(groups: Sequence[str])
Convert groups from pysubs2.time.TIMESTAMP or pysubs2.time.TIMESTAMP_SHORT match to milliseconds.
Example
>>> timestamp_to_ms(TIMESTAMP.match("0:00:00.42").groups())
420
>>> timestamp_to_ms(TIMESTAMP_SHORT.match("0:00:01").groups())
1000
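The key detail in the example above is that the fractional group is a decimal fraction of a second, so "42" means 420 ms rather than 42 ms. The sketch below reuses the two patterns listed earlier in this section; the function body is an assumption written to reproduce the documented outputs, not the library's source.

```python
import re
from typing import Sequence

TIMESTAMP = re.compile(r"(\d{1,2}):(\d{1,2}):(\d{1,2})[.,](\d{1,3})")
TIMESTAMP_SHORT = re.compile(r"(\d{1,2}):(\d{2}):(\d{2})")


def timestamp_to_ms(groups: Sequence[str]) -> int:
    """Convert regex groups (h, m, s[, fraction]) to milliseconds."""
    if len(groups) == 4:
        h, m, s, frac = groups
        # Right-pad the fraction to three digits: "42" -> "420", "5" -> "500".
        ms = int(frac.ljust(3, "0"))
    elif len(groups) == 3:
        h, m, s = groups
        ms = 0
    else:
        raise ValueError("Expected 3 or 4 timestamp groups")
    return ms + 1000 * int(s) + 60_000 * int(m) + 3_600_000 * int(h)
```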
pysubs2.exceptions
— thrown exceptions
- exception pysubs2.exceptions.ContentNotUsable
Current content not usable for specified format
- exception pysubs2.exceptions.FormatAutodetectionError
Subtitle format is ambiguous or unknown.
- exception pysubs2.exceptions.Pysubs2Error
Base class for pysubs2 exceptions.
- exception pysubs2.exceptions.UnknownFPSError
Framerate was not specified and couldn’t be inferred otherwise.
- exception pysubs2.exceptions.UnknownFileExtensionError
File extension does not pertain to any known subtitle format.
- exception pysubs2.exceptions.UnknownFormatIdentifierError
Unknown subtitle format identifier (ie. string like "srt").
pysubs2.formats
— subtitle format implementations
Note
This submodule contains pysubs2 internals. It’s mostly of interest if you’re looking to implement a subtitle format not supported by the library. In that case, have a look at pysubs2.formats.FormatBase.
- pysubs2.substation.parse_tags(text: str, style: pysubs2.ssastyle.SSAStyle = <SSAStyle 20.0px 'Arial'>, styles: Optional[Dict[str, pysubs2.ssastyle.SSAStyle]] = None)
Split text into fragments with computed SSAStyles.
Returns list of tuples (fragment, style), where fragment is a part of text between two brace-delimited override sequences, and style is the computed styling of the fragment, ie. the original style modified by all override sequences before the fragment.
Newline and non-breakable space overrides are left as-is.
Supported override tags:
i, b, u, s
r (with or without style name)
- pysubs2.formats.FILE_EXTENSION_TO_FORMAT_IDENTIFIER: Dict[str, str] = {'.ass': 'ass', '.json': 'json', '.srt': 'srt', '.ssa': 'ssa', '.sub': 'microdvd', '.txt': 'tmp', '.vtt': 'vtt'}
Dict mapping file extensions to format identifiers.
- pysubs2.formats.FORMAT_IDENTIFIER_TO_FORMAT_CLASS: Dict[str, Type[pysubs2.formatbase.FormatBase]] = {'ass': <class 'pysubs2.substation.SubstationFormat'>, 'json': <class 'pysubs2.jsonformat.JSONFormat'>, 'microdvd': <class 'pysubs2.microdvd.MicroDVDFormat'>, 'mpl2': <class 'pysubs2.mpl2.MPL2Format'>, 'srt': <class 'pysubs2.subrip.SubripFormat'>, 'ssa': <class 'pysubs2.substation.SubstationFormat'>, 'tmp': <class 'pysubs2.tmp.TmpFormat'>, 'vtt': <class 'pysubs2.webvtt.WebVTTFormat'>}
Dict mapping format identifiers to implementations (FormatBase subclasses).
- pysubs2.formats.autodetect_format(content: str) str
Return format identifier for given fragment or raise FormatAutodetectionError.
- pysubs2.formats.get_file_extension(format_: str) str
Format identifier -> file extension
- pysubs2.formats.get_format_class(format_: str) Type[pysubs2.formatbase.FormatBase]
Format identifier -> format class (ie. subclass of FormatBase)
- pysubs2.formats.get_format_identifier(ext: str) str
File extension -> format identifier
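The lookup helpers can be understood as thin wrappers around the extension dict shown above. The sketch below copies that dict and defines local stand-in exception classes; which exception each real helper raises on a miss is an assumption inferred from the exception descriptions in pysubs2.exceptions.

```python
from typing import Dict

# Copied from the mapping listed above.
FILE_EXTENSION_TO_FORMAT_IDENTIFIER: Dict[str, str] = {
    ".ass": "ass", ".json": "json", ".srt": "srt", ".ssa": "ssa",
    ".sub": "microdvd", ".txt": "tmp", ".vtt": "vtt",
}


class UnknownFileExtensionError(Exception):
    """Local stand-in for the exception of the same name."""


class UnknownFormatIdentifierError(Exception):
    """Local stand-in for the exception of the same name."""


def get_format_identifier(ext: str) -> str:
    """File extension -> format identifier."""
    try:
        return FILE_EXTENSION_TO_FORMAT_IDENTIFIER[ext]
    except KeyError:
        raise UnknownFileExtensionError(ext) from None


def get_file_extension(format_: str) -> str:
    """Format identifier -> file extension (reverse lookup)."""
    for ext, identifier in FILE_EXTENSION_TO_FORMAT_IDENTIFIER.items():
        if identifier == format_:
            return ext
    raise UnknownFormatIdentifierError(format_)
```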
Subtitle format API
- class pysubs2.formats.FormatBase
Base class for subtitle format implementations.
How to implement a new subtitle format:
1. Create a subclass of FormatBase and override the methods you want to support.
2. Decide on a format identifier, like the "srt" or "microdvd" already used in the library.
3. Add your identifier and class to pysubs2.formats.FORMAT_IDENTIFIER_TO_FORMAT_CLASS.
4. (optional) Add your file extension and identifier to pysubs2.formats.FILE_EXTENSION_TO_FORMAT_IDENTIFIER.
After finishing these steps, you can call SSAFile.load() and SSAFile.save() with your format, including autodetection from content and file extension (if you provided these).
- classmethod from_file(subs, fp: io.TextIOBase, format_: str, **kwargs)
Load subtitle file into an empty SSAFile.
If the parser autodetects framerate, set it as subs.fps.
- Parameters
subs (SSAFile) – An empty SSAFile.
fp (file object) – Text file object, the subtitle file.
format_ (str) – Format identifier. Used when one format class implements multiple formats (see SubstationFormat).
kwargs – Extra options, eg. fps.
- Returns
None
- Raises
pysubs2.exceptions.UnknownFPSError – Framerate was not provided and cannot be detected.
- classmethod guess_format(text: str) Optional[str]
Return format identifier of recognized format, or None.
- Parameters
text (str) – Content of subtitle file. When the file is long, this may be only its first few thousand characters.
- Returns
format identifier (eg. "srt") or None (unknown format)
- classmethod to_file(subs, fp: io.TextIOBase, format_: str, **kwargs)
Write SSAFile into a file.
If you need framerate and it is not passed in keyword arguments, use subs.fps.
- Parameters
subs (SSAFile) – Subtitle file to write.
fp (file object) – Text file object used as output.
format_ (str) – Format identifier of desired output format. Used when one format class implements multiple formats (see SubstationFormat).
kwargs – Extra options, eg. fps.
- Returns
None
- Raises
pysubs2.exceptions.UnknownFPSError – Framerate was not provided and subs.fps is None.
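The how-to for FormatBase can be illustrated with a toy format. Everything in the sketch below is a stand-in so that it runs with the standard library alone: the local FormatBase class mirrors only the three classmethods documented above, the "pipe" format is hypothetical, the registries are local dicts rather than the ones in pysubs2.formats, and a plain list of tuples plays the role of the SSAFile.

```python
import io
from typing import Dict, List, Optional, Type


class FormatBase:
    """Local stand-in exposing the three hooks documented above."""

    @classmethod
    def guess_format(cls, text: str) -> Optional[str]:
        return None

    @classmethod
    def from_file(cls, subs, fp, format_, **kwargs) -> None:
        raise NotImplementedError

    @classmethod
    def to_file(cls, subs, fp, format_, **kwargs) -> None:
        raise NotImplementedError


class PipeFormat(FormatBase):
    """Hypothetical format: one 'start_ms|end_ms|text' record per line."""

    @classmethod
    def guess_format(cls, text: str) -> Optional[str]:
        first = text.lstrip().splitlines()[0] if text.strip() else ""
        parts = first.split("|")
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            return "pipe"
        return None

    @classmethod
    def from_file(cls, subs, fp, format_, **kwargs) -> None:
        # Fill the (initially empty) container in place and return None.
        for line in fp:
            line = line.rstrip("\n")
            if line:
                start, end, text = line.split("|", 2)
                subs.append((int(start), int(end), text))

    @classmethod
    def to_file(cls, subs, fp, format_, **kwargs) -> None:
        for start, end, text in subs:
            fp.write(f"{start}|{end}|{text}\n")


# Registration: identifier -> class, and (optionally) extension -> identifier.
FORMAT_IDENTIFIER_TO_FORMAT_CLASS: Dict[str, Type[FormatBase]] = {"pipe": PipeFormat}
FILE_EXTENSION_TO_FORMAT_IDENTIFIER: Dict[str, str] = {".pipe": "pipe"}

subs: List[tuple] = []
PipeFormat.from_file(subs, io.StringIO("0|2500|Hello\n"), "pipe")
out = io.StringIO()
PipeFormat.to_file(subs, out, "pipe")
```

In a real implementation the subclass would derive from pysubs2.formats.FormatBase, append SSAEvent objects to the SSAFile it receives, and be registered in the library's own dicts.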
Subtitle format implementations
Here you can find specific details regarding support of the individual subtitle formats.
Tip
Some formats support additional keyword parameters in their from_file() or to_file() methods. These are used to customize the parser/writer behaviour.
- class pysubs2.substation.SubstationFormat
SubStation Alpha (ASS, SSA) subtitle format implementation
- classmethod from_file(subs: pysubs2.ssafile.SSAFile, fp, format_, **kwargs)
- classmethod guess_format(text)
- static ms_to_timestamp(ms: int) str
Convert ms to ‘H:MM:SS.cc’
- classmethod to_file(subs: pysubs2.ssafile.SSAFile, fp, format_, header_notice='Script generated by pysubs2\nhttps://pypi.python.org/pypi/pysubs2', **kwargs)
- class pysubs2.subrip.SubripFormat
SubRip Text (SRT) subtitle format implementation
- classmethod from_file(subs, fp, format_, keep_html_tags=False, keep_unknown_html_tags=False, **kwargs)
See pysubs2.formats.FormatBase.from_file()
Supported tags:
<i>
<u>
<s>
<b>
- Keyword Arguments
keep_html_tags – If True, all HTML tags will be kept as-is instead of being converted to SubStation tags (eg. you will get <i>example</i> instead of {\i1}example{\i0}). Setting this to True overrides the keep_unknown_html_tags option.
keep_unknown_html_tags – If True, supported HTML tags will be converted to SubStation tags and any other HTML tags will be kept as-is (eg. you would get <blink>example {\i1}text{\i0}</blink>). If False, these other HTML tags will be stripped from output (in the previous example, you would get only example {\i1}text{\i0}).
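The interaction of the two options can be sketched with a regular expression. The function convert_html_tags and its TAG pattern are assumptions written to reproduce the examples given above; the real parser may differ in how it recognises tags.

```python
import re

SUPPORTED = {"i", "u", "s", "b"}
# Opening or closing tag; group 1 is "/" for closing tags, group 2 the tag name.
TAG = re.compile(r"<\s*(/?)\s*([A-Za-z][A-Za-z0-9]*)[^>]*>")


def convert_html_tags(
    text: str, keep_html_tags: bool = False, keep_unknown_html_tags: bool = False
) -> str:
    """Convert supported HTML tags to SubStation override tags."""
    if keep_html_tags:
        return text  # takes precedence over keep_unknown_html_tags

    def replace(match: "re.Match[str]") -> str:
        closing, name = match.group(1), match.group(2).lower()
        if name in SUPPORTED:
            # <i> becomes {\i1}, </i> becomes {\i0}
            return "{\\%s%d}" % (name, 0 if closing else 1)
        return match.group(0) if keep_unknown_html_tags else ""

    return TAG.sub(replace, text)
```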
- classmethod guess_format(text)
- static ms_to_timestamp(ms: int) str
Convert ms to ‘HH:MM:SS,mmm’
- classmethod to_file(subs, fp, format_, apply_styles=True, keep_ssa_tags=False, **kwargs)
See pysubs2.formats.FormatBase.to_file()
Italic, underline and strikeout styling is supported.
- Keyword Arguments
apply_styles – If False, do not write any styling (ignore line style and override tags).
keep_ssa_tags – If True, instead of trying to convert inline override tags to HTML (as supported by SRT), any inline tags will be passed to output (eg. {\an7}, which would be otherwise stripped; or {\b1} instead of <b>). Whitespace tags \h, \n and \N will always be converted to whitespace regardless of this option. In the current implementation, enabling this option disables processing of line styles: you will get inline tags, but if for example the line’s style is italic, you will not get {\i1} at the beginning of the line. (Since this option is mostly useful for dealing with non-standard SRT files, ie. both input and output are SRT, which doesn’t use line styles, this shouldn’t be much of an issue in practice.)
- class pysubs2.mpl2.MPL2Format
MPL2 subtitle format implementation
- classmethod from_file(subs, fp, format_, **kwargs)
- classmethod guess_format(text)
- classmethod to_file(subs, fp, format_, **kwargs)
See pysubs2.formats.FormatBase.to_file()
No styling is supported at the moment.
- class pysubs2.tmp.TmpFormat
TMP subtitle format implementation
- classmethod from_file(subs, fp, format_, **kwargs)
- classmethod guess_format(text)
- static ms_to_timestamp(ms: int) str
Convert ms to ‘HH:MM:SS’
- classmethod to_file(subs, fp, format_, apply_styles=True, **kwargs)
See pysubs2.formats.FormatBase.to_file()
Italic, underline and strikeout styling is supported.
- Keyword Arguments
apply_styles – If False, do not write any styling.
- class pysubs2.webvtt.WebVTTFormat
Web Video Text Tracks (WebVTT) subtitle format implementation
Currently, this shares implementation with pysubs2.subrip.SubripFormat.
- classmethod guess_format(text)
- static ms_to_timestamp(ms: int) str
Convert ms to ‘HH:MM:SS.mmm’
- classmethod to_file(subs, fp, format_, **kwargs)
- class pysubs2.microdvd.MicroDVDFormat
MicroDVD subtitle format implementation
- classmethod from_file(subs, fp, format_, fps=None, **kwargs)
- classmethod guess_format(text)
- classmethod to_file(subs, fp, format_, fps=None, write_fps_declaration=True, apply_styles=True, **kwargs)
See pysubs2.formats.FormatBase.to_file()
The only supported styling is marking whole lines italic.
- Keyword Arguments
write_fps_declaration – If True, create a zero-duration first subtitle which will contain the fps.
apply_styles – If False, do not write any styling.
- class pysubs2.jsonformat.JSONFormat
Implementation of JSON subtitle pseudo-format (serialized pysubs2 internal representation)
This is essentially SubStation Alpha as JSON.
- classmethod from_file(subs, fp, format_, **kwargs)
- classmethod guess_format(text)
- classmethod to_file(subs, fp, format_, **kwargs)
Misc functions
- pysubs2.whisper.load_from_whisper(result_or_segments: Union[Dict[str, Any], List[Dict[str, Any]]]) pysubs2.ssafile.SSAFile
Load subtitle file from OpenAI Whisper transcript
Example
>>> import whisper
>>> import pysubs2
>>> model = whisper.load_model("base")
>>> result = model.transcribe("audio.mp3")
>>> subs = pysubs2.load_from_whisper(result)
>>> subs.save("audio.ass")
- Parameters
result_or_segments – Either a dict with a "segments" key that holds a list of segment dicts, or the segment list-of-dicts. Each segment is a dict with keys "start", "end" (float, timestamps in seconds) and "text" (str with caption text).
- Returns