API Reference¶
pysubs2
— the main module¶
-
class
pysubs2.
Color
(r: int, g: int, b: int, a: int = 0)¶ (r, g, b, a) namedtuple for 8-bit RGB color with alpha channel.
All values are ints from 0 to 255.
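As a rough illustration of the shape described above, the following stand-alone sketch builds an equivalent named tuple with the standard library. `ColorSketch` and `check_range` are hypothetical names used only here; they are not part of pysubs2, and whether the library itself validates the 0 to 255 range is not stated in this reference.

```python
from typing import NamedTuple


class ColorSketch(NamedTuple):
    """Stand-in for an (r, g, b, a) namedtuple; not the pysubs2 class."""
    r: int
    g: int
    b: int
    a: int = 0  # alpha defaults to 0, as in the signature above


def check_range(color: ColorSketch) -> ColorSketch:
    """Reject channel values outside the documented 0..255 range."""
    for name, value in zip(color._fields, color):
        if not 0 <= value <= 255:
            raise ValueError(f"{name} must be in 0..255, got {value}")
    return color


red = check_range(ColorSketch(255, 0, 0))
r, g, b, a = red  # unpacks like any other tuple
```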
-
pysubs2.
load
(path: str, encoding: str = 'utf-8', format_: Optional[str] = None, fps: Optional[float] = None, **kwargs) → pysubs2.ssafile.SSAFile¶ Alias for
SSAFile.load()
.
-
pysubs2.
make_time
(h: Union[int, float] = 0, m: Union[int, float] = 0, s: Union[int, float] = 0, ms: Union[int, float] = 0, frames: Optional[int] = None, fps: Optional[float] = None)¶ Alias for
pysubs2.time.make_time()
.
SSAFile
— a subtitle file¶
-
class
pysubs2.
SSAFile
¶ Subtitle file in SubStation Alpha format.
This class has a list-like interface which exposes
SSAFile.events
, list of subtitles in the file:subs = SSAFile.load("subtitles.srt") for line in subs: print(line.text) subs.insert(0, SSAEvent(start=0, end=make_time(s=2.5), text="New first subtitle")) del subs[0]
-
aegisub_project
: Dict[str, str]¶ Dict with Aegisub project data, ie. the [Aegisub Project Garbage] section.
-
format
: Optional[str]¶ Format of source subtitle file, if applicable, eg.
"srt"
.
-
fps
: Optional[float]¶ Framerate used when reading the file, if applicable.
-
info
: Dict[str, str]¶ Dict with script metadata, ie. the [Script Info] section.
-
Reading and writing subtitles¶
Using path to file¶
-
classmethod
SSAFile.
load
(path: str, encoding: str = 'utf-8', format_: Optional[str] = None, fps: Optional[float] = None, **kwargs) → pysubs2.ssafile.SSAFile¶ Load subtitle file from given path.
- Parameters
path (str) – Path to subtitle file.
encoding (str) – Character encoding of input file. Defaults to UTF-8, you may need to change this.
format_ (str) – Optional, forces use of a specific parser (eg. “srt”, “ass”). Otherwise, the format is detected automatically from file contents. This argument should rarely be needed.
fps (float) – Framerate for frame-based formats (MicroDVD), for other formats this argument is ignored. Framerate might be detected from the file, in which case you don’t need to specify it here (when given, this argument overrides autodetection).
keep_unknown_html_tags (bool) – This affects SubRip (SRT) only; for other formats this argument is ignored. By default, HTML tags are converted to equivalent SubStation tags (eg. <i> to {\i1}) and any remaining tags are removed to keep the text clean. Set this parameter to True if you want to pass through these tags (eg. <sub>). This is useful if your output format is SRT and your player supports these tags.
- Returns
SSAFile
- Raises
IOError –
UnicodeDecodeError –
Note
pysubs2 may autodetect subtitle format and/or framerate. These values are set as SSAFile.format and SSAFile.fps attributes.
Example
>>> subs1 = pysubs2.load("subrip-subtitles.srt")
>>> subs2 = pysubs2.load("microdvd-subtitles.sub", fps=23.976)
>>> subs3 = pysubs2.load("subrip-subtitles-with-fancy-tags.srt", keep_unknown_html_tags=True)
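The effect of keep_unknown_html_tags can be pictured with a simplified, stand-alone sketch. It only maps the `<i>` tag named above (the closing `{\i0}` is an assumption based on usual SubStation semantics), and `convert_srt_text` is a hypothetical helper, not the library's parser, which may handle more tags and edge cases.

```python
import re

# Assumption for illustration: only <i>...</i> is mapped to SubStation tags.
ITALIC_OPEN = re.compile(r"<i>", re.IGNORECASE)
ITALIC_CLOSE = re.compile(r"</i>", re.IGNORECASE)
ANY_TAG = re.compile(r"</?[a-zA-Z][^>]*>")


def convert_srt_text(text: str, keep_unknown_html_tags: bool = False) -> str:
    """Convert known HTML tags, then drop or keep the remaining ones."""
    text = ITALIC_OPEN.sub(lambda m: r"{\i1}", text)
    text = ITALIC_CLOSE.sub(lambda m: r"{\i0}", text)
    if not keep_unknown_html_tags:
        text = ANY_TAG.sub("", text)  # strip whatever HTML is left
    return text


cleaned = convert_srt_text("<i>Hi</i> H<sub>2</sub>O")
passed_through = convert_srt_text("<i>Hi</i> H<sub>2</sub>O", keep_unknown_html_tags=True)
```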
-
SSAFile.
save
(path: str, encoding: str = 'utf-8', format_: Optional[str] = None, fps: Optional[float] = None, **kwargs)¶ Save subtitle file to given path.
- Parameters
path (str) – Path to subtitle file.
encoding (str) – Character encoding of output file. Defaults to UTF-8, which should be fine for most purposes.
format_ (str) – Optional, specifies desired subtitle format (eg. “srt”, “ass”). Otherwise, the format is detected automatically from the file extension. Thus, this argument is rarely needed.
fps (float) – Framerate for frame-based formats (MicroDVD); for other formats this argument is ignored. When omitted, the SSAFile.fps value is used (ie. the framerate used for loading the file, if any). When the SSAFile wasn’t loaded from MicroDVD, or if you wish to save it with a different framerate, use this argument. See also SSAFile.transform_framerate() for fixing bad frame-based to time-based conversions.
kwargs – Extra options for the writer.
- Raises
IOError –
UnicodeEncodeError –
Using string¶
-
classmethod
SSAFile.
from_string
(string: str, format_: Optional[str] = None, fps: Optional[float] = None, **kwargs) → pysubs2.ssafile.SSAFile¶ Load subtitle file from string.
See SSAFile.load() for full description.
- Parameters
string (str) – Subtitle file in a string. Note that the string must be Unicode (in Python 2).
- Returns
SSAFile
Example
>>> text = '''
... 1
... 00:00:00,000 --> 00:00:05,000
... An example SubRip file.
... '''
>>> subs = SSAFile.from_string(text)
-
SSAFile.
to_string
(format_: str, fps: Optional[float] = None, **kwargs) → str¶ Get subtitle file as a string.
See SSAFile.save() for full description.
- Returns
str
Using file object¶
-
classmethod
SSAFile.
from_file
(fp: io.TextIOBase, format_: Optional[str] = None, fps: Optional[float] = None, **kwargs) → pysubs2.ssafile.SSAFile¶ Read subtitle file from file object.
See SSAFile.load() for full description.
Note
This is a low-level method. Usually, one of SSAFile.load() or SSAFile.from_string() is preferable.
- Parameters
fp (file object) – A file object, ie. io.TextIOBase instance. Note that the file must be opened in text mode (as opposed to binary).
- Returns
SSAFile
-
SSAFile.
to_file
(fp: io.TextIOBase, format_: str, fps: Optional[float] = None, **kwargs)¶ Write subtitle file to file object.
See SSAFile.save() for full description.
Note
This is a low-level method. Usually, one of SSAFile.save() or SSAFile.to_string() is preferable.
- Parameters
fp (file object) – A file object, ie. io.TextIOBase instance. Note that the file must be opened in text mode (as opposed to binary).
Retiming subtitles¶
-
SSAFile.
shift
(h: Union[int, float] = 0, m: Union[int, float] = 0, s: Union[int, float] = 0, ms: Union[int, float] = 0, frames: Optional[int] = None, fps: Optional[float] = None)¶ Shift all subtitles by constant time amount.
Shift may be time-based (the default) or frame-based. In the latter case, specify both frames and fps; h, m, s and ms will then be ignored.
- Parameters
h – Integer or float values, may be positive or negative.
m – Integer or float values, may be positive or negative.
s – Integer or float values, may be positive or negative.
ms – Integer or float values, may be positive or negative.
frames (int) – When specified, must be an integer number of frames. May be positive or negative. fps must also be specified.
fps (float) – When specified, must be a positive number.
- Raises
ValueError – Invalid fps or missing number of frames.
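The time-based versus frame-based choice described above can be sketched without the library. `EventSketch` and `shift_all` are hypothetical stand-ins, and the use of `round` for the frame conversion is an assumption; only the argument handling follows the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EventSketch:
    start: int  # milliseconds
    end: int    # milliseconds


def shift_all(events: List[EventSketch], h=0, m=0, s=0, ms=0,
              frames: Optional[int] = None, fps: Optional[float] = None) -> None:
    """Shift every event by a constant amount, in place."""
    if frames is not None or fps is not None:
        # Frame-based shift: both values are required, h/m/s/ms are ignored.
        if frames is None or fps is None or fps <= 0:
            raise ValueError("frame-based shift needs frames and a positive fps")
        delta = int(round(frames * 1000 / fps))
    else:
        delta = int(round(h * 3_600_000 + m * 60_000 + s * 1000 + ms))
    for event in events:
        event.start += delta
        event.end += delta


events = [EventSketch(1000, 2000)]
shift_all(events, s=-0.5)            # half a second earlier
shift_all(events, frames=50, fps=25) # 50 frames at 25 fps, ie. two seconds later
```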
-
SSAFile.
transform_framerate
(in_fps: float, out_fps: float)¶ Rescale all timestamps by the ratio in_fps/out_fps.
Can be used to fix files converted from frame-based to time-based with wrongly assumed framerate.
- Parameters
in_fps (float) –
out_fps (float) –
- Raises
ValueError – Non-positive framerate given.
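As a worked example of the ratio, the stand-alone sketch below rescales a single timestamp. `rescale_ms` is a hypothetical helper and the rounding to int is an assumption; the library applies the rescaling to every subtitle in the file.

```python
def rescale_ms(ms: int, in_fps: float, out_fps: float) -> int:
    """Rescale one timestamp by in_fps/out_fps."""
    if in_fps <= 0 or out_fps <= 0:
        raise ValueError("Framerates must be positive")
    return int(round(ms * in_fps / out_fps))


# One minute rescaled by 24/25 becomes 57.6 seconds.
one_minute = rescale_ms(60_000, in_fps=24, out_fps=25)
```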
Working with styles¶
-
SSAFile.
rename_style
(old_name: str, new_name: str)¶ Rename a style, including references to it.
- Parameters
old_name (str) – Style to be renamed.
new_name (str) – New name for the style (must be unused).
- Raises
KeyError – No style named old_name.
ValueError – new_name is not a legal name (cannot use commas) or new_name is taken.
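The renaming rules above can be illustrated with plain Python containers. This is a minimal sketch under assumptions: `rename_style_sketch` is hypothetical, a dict stands in for the styles mapping, and a list of style names stands in for the events that reference them.

```python
from typing import Dict, List


def rename_style_sketch(styles: Dict[str, object], event_styles: List[str],
                        old_name: str, new_name: str) -> None:
    """Rename a style key and update every reference to it."""
    if old_name not in styles:
        raise KeyError(f"No style named {old_name!r}")
    if new_name in styles:
        raise ValueError(f"Style name {new_name!r} is already taken")
    if "," in new_name:
        raise ValueError(f"{new_name!r} is not a legal name (contains a comma)")
    styles[new_name] = styles.pop(old_name)
    for i, name in enumerate(event_styles):
        if name == old_name:
            event_styles[i] = new_name


styles = {"Default": "style-a", "Sign": "style-b"}
event_styles = ["Default", "Sign", "Default"]
rename_style_sketch(styles, event_styles, "Default", "Main")
```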
Misc methods¶
-
SSAFile.
remove_miscellaneous_events
()¶ Remove subtitles which appear to be non-essential (the --clean option in the CLI).
Currently, this removes events matching any of these criteria:
- SSA event type Comment
- SSA drawing tags
- Less than two characters of text
- Duplicated text with identical time interval (only the first event is kept)
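The criteria listed above can be read as a simple filter. The sketch below is a stand-alone approximation: `EventSketch` and `remove_misc_sketch` are hypothetical, drawing detection is reduced to a boolean flag, stripping whitespace before counting characters is an assumption, and a new list is returned instead of modifying a file in place.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EventSketch:
    start: int
    end: int
    text: str
    type: str = "Dialogue"
    is_drawing: bool = False


def remove_misc_sketch(events: List[EventSketch]) -> List[EventSketch]:
    """Return only the events that match none of the listed criteria."""
    kept = []
    seen = set()
    for event in events:
        if event.type == "Comment" or event.is_drawing:
            continue
        if len(event.text.strip()) < 2:
            continue
        key = (event.start, event.end, event.text)
        if key in seen:  # duplicate text with identical interval
            continue
        seen.add(key)
        kept.append(event)
    return kept


events = [
    EventSketch(0, 1000, "Hello"),
    EventSketch(0, 1000, "Hello"),                    # duplicate, dropped
    EventSketch(0, 1000, "note", type="Comment"),     # comment, dropped
    EventSketch(1000, 2000, "m 0 0 l 1 1", is_drawing=True),
    EventSketch(2000, 3000, "-"),                     # too short, dropped
    EventSketch(3000, 4000, "Bye"),
]
kept = remove_misc_sketch(events)
```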
-
SSAFile.
equals
(other: pysubs2.ssafile.SSAFile)¶ Equality of two SSAFiles.
Compares SSAFile.info, SSAFile.styles and SSAFile.events. Order of entries in OrderedDicts does not matter. “ScriptType” key in info is considered an implementation detail and thus ignored.
Useful mostly in unit tests. Differences are logged at DEBUG level.
-
SSAFile.
sort
()¶ Sort subtitles time-wise, in-place.
SSAEvent
— one subtitle¶
-
class
pysubs2.
SSAEvent
(start: int = 0, end: int = 10000, text: str = '', marked: bool = False, layer: int = 0, style: str = 'Default', name: str = '', marginl: int = 0, marginr: int = 0, marginv: int = 0, effect: str = '', type: str = 'Dialogue')¶ A SubStation Event, ie. one subtitle.
In SubStation, each subtitle consists of multiple “fields” like Start, End and Text. These are exposed as attributes (note that they are lowercase; see SSAEvent.FIELDS for a list). Additionally, there are some convenience properties like SSAEvent.plaintext or SSAEvent.duration.
This class defines an ordering with respect to (start, end) timestamps.
Tip
Use pysubs2.make_time() to get times in milliseconds.
Example:
>>> ev = SSAEvent(start=make_time(s=1), end=make_time(s=2.5), text="Hello World!")
-
FIELDS
= frozenset({'effect', 'end', 'layer', 'marginl', 'marginr', 'marginv', 'marked', 'name', 'start', 'style', 'text', 'type'})¶ All fields in SSAEvent.
-
copy
() → pysubs2.ssaevent.SSAEvent¶ Return a copy of the SSAEvent.
-
property
duration
¶ Subtitle duration in milliseconds (read/write property).
Writing to this property adjusts SSAEvent.end. Setting negative durations raises ValueError.
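A read/write property with this behaviour could look like the following stand-alone sketch. `TimedSketch` is a hypothetical class, not SSAEvent; the default end of 10000 ms mirrors the constructor signature shown above.

```python
class TimedSketch:
    """Minimal stand-in with start/end times in milliseconds."""

    def __init__(self, start: int = 0, end: int = 10000) -> None:
        self.start = start
        self.end = end

    @property
    def duration(self) -> int:
        return self.end - self.start

    @duration.setter
    def duration(self, ms: int) -> None:
        if ms < 0:
            raise ValueError("Subtitle duration cannot be negative")
        self.end = self.start + ms  # writing adjusts the end time only


line = TimedSketch(start=1000, end=2500)
original_duration = line.duration
line.duration = 4000
```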
-
effect
: str¶ Line effect
-
end
: int¶ Subtitle end time (in milliseconds)
-
equals
(other: pysubs2.ssaevent.SSAEvent) → bool¶ Field-based equality for SSAEvents.
-
property
is_comment
¶ When true, the subtitle is a comment, ie. not visible (read/write property).
Setting this property is equivalent to changing SSAEvent.type to "Dialogue" or "Comment".
-
property
is_drawing
¶ Returns True if line is SSA drawing tag (ie. not text)
-
layer
: int¶ Layer number, 0 is the lowest layer (ASS only)
-
marginl
: int¶ Left margin
-
marginr
: int¶ Right margin
-
marginv
: int¶ Vertical margin
-
marked
: bool¶ (SSA only)
-
name
: str¶ Actor name
-
property
plaintext
¶ Subtitle text as multi-line string with no tags (read/write property).
Writing to this property replaces SSAEvent.text with given plain text. Newlines are converted to \N tags.
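The two directions of this property can be approximated with a few string operations. This is a simplified sketch: `to_plaintext` and `from_plaintext` are hypothetical helpers, and treating every brace-delimited block as an override sequence to be removed is an assumption about how tags are stripped.

```python
import re

# Assumption: override tags are the brace-delimited blocks such as {\i1}.
OVERRIDE_BLOCK = re.compile(r"\{[^}]*\}")


def to_plaintext(text: str) -> str:
    """Drop override blocks and turn \\N tags into real newlines."""
    return OVERRIDE_BLOCK.sub("", text).replace(r"\N", "\n")


def from_plaintext(plain: str) -> str:
    """Turn real newlines into \\N tags."""
    return plain.replace("\n", r"\N")


plain = to_plaintext(r"{\i1}Hello{\i0}\Nworld")
tagged = from_plaintext("first line\nsecond line")
```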
-
shift
(h: Union[int, float] = 0, m: Union[int, float] = 0, s: Union[int, float] = 0, ms: Union[int, float] = 0, frames: Optional[int] = None, fps: Optional[float] = None)¶ Shift start and end times.
See
SSAFile.shift()
for full description.
-
start
: int¶ Subtitle start time (in milliseconds)
-
style
: str¶ Style name
-
text
: str¶ Text of subtitle (with SubStation override tags)
-
type
: str¶ Line type (Dialogue/Comment)
-
SSAStyle
— a subtitle style¶
-
class
pysubs2.
SSAStyle
(fontname: str = 'Arial', fontsize: float = 20.0, primarycolor: pysubs2.common.Color = Color(r=255, g=255, b=255, a=0), secondarycolor: pysubs2.common.Color = Color(r=255, g=0, b=0, a=0), tertiarycolor: pysubs2.common.Color = Color(r=0, g=0, b=0, a=0), outlinecolor: pysubs2.common.Color = Color(r=0, g=0, b=0, a=0), backcolor: pysubs2.common.Color = Color(r=0, g=0, b=0, a=0), bold: bool = False, italic: bool = False, underline: bool = False, strikeout: bool = False, scalex: float = 100.0, scaley: float = 100.0, spacing: float = 0.0, angle: float = 0.0, borderstyle: int = 1, outline: float = 2.0, shadow: float = 2.0, alignment: int = 2, marginl: int = 10, marginr: int = 10, marginv: int = 10, alphalevel: int = 0, encoding: int = 1)¶ A SubStation Style.
In SubStation, each subtitle (SSAEvent) is associated with a style which defines its font, color, etc. Like a subtitle event, a style also consists of “fields”; see SSAStyle.FIELDS for a list (note the spelling, which is different from SubStation proper).
Subtitles and styles are connected via an SSAFile they belong to. SSAEvent.style is a string which is (or should be) a key in the SSAFile.styles dict. Note that style name is stored separately; a given SSAStyle instance has no particular name itself.
This class defines equality (equality of all fields).
-
FIELDS
= frozenset({'alignment', 'alphalevel', 'angle', 'backcolor', 'bold', 'borderstyle', 'encoding', 'fontname', 'fontsize', 'italic', 'marginl', 'marginr', 'marginv', 'outline', 'outlinecolor', 'primarycolor', 'scalex', 'scaley', 'secondarycolor', 'shadow', 'spacing', 'strikeout', 'tertiarycolor', 'underline'})¶ All fields in SSAStyle.
-
alignment
: int¶ Numpad-style alignment, eg. 7 is “top left” (that is, ASS alignment semantics)
-
alphalevel
: int¶ Old, unused SSA-only field
-
angle
: float¶ Rotation (ASS only)
-
backcolor
: Color¶ Back, ie. shadow color (
pysubs2.Color
instance)
-
bold
: bool¶ Bold
-
borderstyle
: int¶ Border style
-
drawing
: bool¶ Drawing (ASS only override tag, see http://docs.aegisub.org/3.1/ASS_Tags/#drawing-tags)
-
encoding
: int¶ Charset
-
fontname
: str¶ Font name
-
fontsize
: float¶ Font size (in pixels)
-
italic
: bool¶ Italic
-
marginl
: int¶ Left margin (in pixels)
-
marginr
: int¶ Right margin (in pixels)
-
marginv
: int¶ Vertical margin (in pixels)
-
outline
: float¶ Outline width (in pixels)
-
outlinecolor
: Color¶ Outline color (
pysubs2.Color
instance)
-
primarycolor
: Color¶ Primary color (
pysubs2.Color
instance)
-
scalex
: float¶ Horizontal scaling (ASS only)
-
scaley
: float¶ Vertical scaling (ASS only)
-
secondarycolor
: Color¶ Secondary color (
pysubs2.Color
instance)
-
shadow
: float¶ Shadow depth (in pixels)
-
spacing
: float¶ Letter spacing (ASS only)
-
strikeout
: bool¶ Strikeout (ASS only)
-
tertiarycolor
: Color¶ Tertiary color (
pysubs2.Color
instance)
-
underline
: bool¶ Underline (ASS only)
-
pysubs2.time
— time-related utilities¶
-
pysubs2.time.
frames_to_ms
(frames: int, fps: float) → int¶ Convert frame-based duration to milliseconds.
- Parameters
frames – Number of frames (should be int).
fps – Framerate (must be a positive number, eg. 23.976).
- Returns
Number of milliseconds (rounded to int).
- Raises
ValueError – fps was negative or zero.
-
pysubs2.time.
make_time
(h: Union[int, float] = 0, m: Union[int, float] = 0, s: Union[int, float] = 0, ms: Union[int, float] = 0, frames: Optional[int] = None, fps: Optional[float] = None)¶ Convert time to milliseconds.
See pysubs2.time.times_to_ms(). When both frames and fps are specified, pysubs2.time.frames_to_ms() is called instead.
- Raises
ValueError – Invalid fps, or one of frames/fps is missing.
Example
>>> make_time(s=1.5)
1500
>>> make_time(frames=50, fps=25)
2000
-
pysubs2.time.
ms_to_frames
(ms: Union[int, float], fps: float) → int¶ Convert milliseconds to number of frames.
- Parameters
ms – Number of milliseconds (may be int, float or other numeric class).
fps – Framerate (must be a positive number, eg. 23.976).
- Returns
Number of frames (int).
- Raises
ValueError – fps was negative or zero.
-
pysubs2.time.
ms_to_str
(ms: Union[int, float], fractions: bool = False) → str¶ Prettyprint milliseconds to [-]H:MM:SS[.mmm]
Handles huge and/or negative times. Non-negative times with fractions=True are matched by pysubs2.time.TIMESTAMP.
- Parameters
ms – Number of milliseconds (int, float or other numeric class).
fractions – Whether to print up to millisecond precision.
- Returns
str
-
pysubs2.time.
ms_to_times
(ms: Union[int, float]) → Tuple[int, int, int, int]¶ Convert milliseconds to normalized tuple (h, m, s, ms).
- Parameters
ms – Number of milliseconds (may be int, float or other numeric class). Should be non-negative.
- Returns
Named tuple (h, m, s, ms) of ints. Invariants:
ms in range(1000) and s in range(60) and m in range(60)
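The normalization invariants above, together with the [-]H:MM:SS[.mmm] layout described for ms_to_str, can be sketched with `divmod`. The names `Times`, `ms_to_times_sketch` and `ms_to_str_sketch` are hypothetical stand-ins, and rounding the input to int first is an assumption.

```python
from typing import NamedTuple


class Times(NamedTuple):
    h: int
    m: int
    s: int
    ms: int


def ms_to_times_sketch(ms) -> Times:
    """Split non-negative milliseconds into normalized (h, m, s, ms)."""
    ms = int(round(ms))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return Times(h, m, s, ms)


def ms_to_str_sketch(ms, fractions: bool = False) -> str:
    """Format as [-]H:MM:SS, optionally followed by .mmm."""
    sign = "-" if ms < 0 else ""
    h, m, s, frac = ms_to_times_sketch(abs(ms))
    base = f"{sign}{h}:{m:02d}:{s:02d}"
    return f"{base}.{frac:03d}" if fractions else base


parts = ms_to_times_sketch(3_723_004)  # 1 h, 2 min, 3 s, 4 ms
```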
-
pysubs2.time.
times_to_ms
(h: Union[int, float] = 0, m: Union[int, float] = 0, s: Union[int, float] = 0, ms: Union[int, float] = 0) → int¶ Convert hours, minutes, seconds to milliseconds.
Arguments may be positive or negative, int or float, and need not be normalized (s=120 is okay).
- Returns
Number of milliseconds (rounded to int).
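The arithmetic behind this conversion is a weighted sum. The sketch below uses a hypothetical `times_to_ms_sketch` helper and assumes Python's built-in `round` for the final rounding step.

```python
def times_to_ms_sketch(h=0, m=0, s=0, ms=0) -> int:
    """Convert possibly non-normalized h/m/s/ms values to milliseconds."""
    total = h * 3_600_000 + m * 60_000 + s * 1000 + ms
    return int(round(total))


two_minutes = times_to_ms_sketch(s=120)        # not normalized, still fine
half_hour = times_to_ms_sketch(h=1, m=-30)     # negative parts are allowed
fractional = times_to_ms_sketch(s=1.5)         # floats are allowed
```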
-
pysubs2.time.
timestamp_to_ms
(groups: Sequence[str])¶ Convert groups from pysubs2.time.TIMESTAMP match to milliseconds.
Example
>>> timestamp_to_ms(TIMESTAMP.match("0:00:00.42").groups())
420
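The result of 420 in the example follows from reading ".42" as a decimal fraction of a second. The stand-alone sketch below shows one way to get that behaviour; `TIMESTAMP_SKETCH` is an assumed pattern (the actual pysubs2.time.TIMESTAMP regex is not shown in this reference) and `timestamp_to_ms_sketch` is a hypothetical helper.

```python
import re
from typing import Sequence

# Assumed pattern for illustration only: H:MM:SS followed by 1-3 fraction digits.
TIMESTAMP_SKETCH = re.compile(r"(\d{1,2}):(\d{2}):(\d{2})[.,](\d{1,3})")


def timestamp_to_ms_sketch(groups: Sequence[str]) -> int:
    """Convert (h, m, s, fraction) string groups to milliseconds."""
    h, m, s, frac = groups
    frac_ms = int(frac.ljust(3, "0"))  # "42" -> "420", "5" -> "500"
    return int(h) * 3_600_000 + int(m) * 60_000 + int(s) * 1000 + frac_ms


value = timestamp_to_ms_sketch(TIMESTAMP_SKETCH.match("0:00:00.42").groups())
```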
-
pysubs2.time.
tmptimestamp_to_ms
(groups: Sequence[str])¶ Convert groups from pysubs2.time.TMPTIMESTAMP match to milliseconds.
Example
>>> tmptimestamp_to_ms(TMPTIMESTAMP.match("0:00:01").groups())
1000
pysubs2.exceptions
— thrown exceptions¶
-
exception
pysubs2.exceptions.
ContentNotUsable
¶ Current content not usable for specified format
-
exception
pysubs2.exceptions.
FormatAutodetectionError
¶ Subtitle format is ambiguous or unknown.
-
exception
pysubs2.exceptions.
Pysubs2Error
¶ Base class for pysubs2 exceptions.
-
exception
pysubs2.exceptions.
UnknownFPSError
¶ Framerate was not specified and couldn’t be inferred otherwise.
-
exception
pysubs2.exceptions.
UnknownFileExtensionError
¶ File extension does not pertain to any known subtitle format.
-
exception
pysubs2.exceptions.
UnknownFormatIdentifierError
¶ Unknown subtitle format identifier (ie. string like "srt").
pysubs2.formats
— subtitle format implementations¶
Note
This submodule contains pysubs2 internals. It’s mostly of interest if you’re looking to implement a subtitle format not supported by the library. In that case, have a look at pysubs2.formats.FormatBase.
Split text into fragments with computed SSAStyles.
Returns list of tuples (fragment, style), where fragment is a part of text between two brace-delimited override sequences, and style is the computed styling of the fragment, ie. the original style modified by all override sequences before the fragment.
Newline and non-breakable space overrides are left as-is.
Supported override tags:
i, b, u, s
r (with or without style name)
-
pysubs2.formats.
FILE_EXTENSION_TO_FORMAT_IDENTIFIER
: Dict[str, str] = {'.ass': 'ass', '.json': 'json', '.srt': 'srt', '.ssa': 'ssa', '.sub': 'microdvd', '.txt': 'tmp', '.vtt': 'vtt'}¶ Dict mapping file extensions to format identifiers.
-
pysubs2.formats.
FORMAT_IDENTIFIER_TO_FORMAT_CLASS
: Dict[str, Type[pysubs2.formatbase.FormatBase]] = {'ass': <class 'pysubs2.substation.SubstationFormat'>, 'json': <class 'pysubs2.jsonformat.JSONFormat'>, 'microdvd': <class 'pysubs2.microdvd.MicroDVDFormat'>, 'mpl2': <class 'pysubs2.mpl2.MPL2Format'>, 'srt': <class 'pysubs2.subrip.SubripFormat'>, 'ssa': <class 'pysubs2.substation.SubstationFormat'>, 'tmp': <class 'pysubs2.tmp.TmpFormat'>, 'vtt': <class 'pysubs2.webvtt.WebVTTFormat'>}¶ Dict mapping format identifiers to implementations (FormatBase subclasses).
-
pysubs2.formats.
autodetect_format
(content: str) → str¶ Return format identifier for given fragment or raise FormatAutodetectionError.
-
pysubs2.formats.
get_file_extension
(format_: str) → str¶ Format identifier -> file extension
-
pysubs2.formats.
get_format_class
(format_: str) → Type[pysubs2.formatbase.FormatBase]¶ Format identifier -> format class (ie. subclass of FormatBase)
-
pysubs2.formats.
get_format_identifier
(ext: str) → str¶ File extension -> format identifier
-
class
pysubs2.formats.
FormatBase
¶ Base class for subtitle format implementations.
How to implement a new subtitle format:
Create a subclass of FormatBase and override the methods you want to support.
Decide on a format identifier, like the "srt" or "microdvd" already used in the library.
Add your identifier and class to pysubs2.formats.FORMAT_IDENTIFIER_TO_FORMAT_CLASS.
(optional) Add your file extension and identifier to pysubs2.formats.FILE_EXTENSION_TO_FORMAT_IDENTIFIER.
After finishing these steps, you can call SSAFile.load() and SSAFile.save() with your format, including autodetection from content and file extension (if you provided these).
-
classmethod
from_file
(subs, fp: io.TextIOBase, format_: str, **kwargs)¶ Load subtitle file into an empty SSAFile.
If the parser autodetects framerate, set it as subs.fps.
- Parameters
subs (SSAFile) – An empty SSAFile.
fp (file object) – Text file object, the subtitle file.
format_ (str) – Format identifier. Used when one format class implements multiple formats (see SubstationFormat).
kwargs – Extra options, eg. fps.
- Returns
None
- Raises
pysubs2.exceptions.UnknownFPSError – Framerate was not provided and cannot be detected.
-
classmethod
guess_format
(text: str) → Optional[str]¶ Return format identifier of recognized format, or None.
- Parameters
text (str) – Content of subtitle file. When the file is long, this may be only its first few thousand characters.
- Returns
format identifier (eg. "srt") or None (unknown format)
-
classmethod
to_file
(subs, fp: io.TextIOBase, format_: str, **kwargs)¶ Write SSAFile into a file.
If you need framerate and it is not passed in keyword arguments, use subs.fps.
- Parameters
subs (SSAFile) – Subtitle file to write.
fp (file object) – Text file object used as output.
format_ (str) – Format identifier of desired output format. Used when one format class implements multiple formats (see SubstationFormat).
kwargs – Extra options, eg. fps.
- Returns
None
- Raises
pysubs2.exceptions.UnknownFPSError – Framerate was not provided and subs.fps is None.
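The steps listed under FormatBase for adding a new format can be pictured with a self-contained mock-up. Everything here is a stand-in: `FormatBaseSketch`, `LinesFormat`, the "lines" identifier, the ".lines" extension and the two dicts are hypothetical and only mirror the interface described above; a plain list plays the role of an empty SSAFile.

```python
import io
from typing import Dict, Optional, Type


class FormatBaseSketch:
    """Stand-in for the three classmethods described above."""

    @classmethod
    def guess_format(cls, text: str) -> Optional[str]:
        return None

    @classmethod
    def from_file(cls, subs, fp, format_, **kwargs) -> None:
        raise NotImplementedError

    @classmethod
    def to_file(cls, subs, fp, format_, **kwargs) -> None:
        raise NotImplementedError


class LinesFormat(FormatBaseSketch):
    """Hypothetical toy format: one 'start_ms end_ms text' entry per line."""

    @classmethod
    def guess_format(cls, text):
        lines = text.strip().splitlines()
        if not lines:
            return None
        parts = lines[0].split(" ", 2)
        ok = len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit()
        return "lines" if ok else None

    @classmethod
    def from_file(cls, subs, fp, format_, **kwargs):
        for line in fp:
            if line.strip():
                start, end, text = line.rstrip("\n").split(" ", 2)
                subs.append((int(start), int(end), text))

    @classmethod
    def to_file(cls, subs, fp, format_, **kwargs):
        for start, end, text in subs:
            fp.write(f"{start} {end} {text}\n")


# Stand-ins for the identifier-to-class and extension-to-identifier dicts.
FORMATS: Dict[str, Type[FormatBaseSketch]] = {"lines": LinesFormat}
EXTENSIONS: Dict[str, str] = {".lines": "lines"}

subs = []  # plays the role of an empty subtitle file
fmt = FORMATS[EXTENSIONS[".lines"]]
fmt.from_file(subs, io.StringIO("0 2500 Hello\n"), "lines")
out = io.StringIO()
fmt.to_file(subs, out, "lines")
```

The lookup goes from extension to identifier and then from identifier to class, which is why the optional extension entry only needs the identifier.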