API Reference
Note
The documentation is written from a Python 3 point of view; a “string” means a Unicode string.
Supported input/output formats
pysubs2 is built around SubStation Alpha, the native subtitle format of Aegisub.
- SubStation Alpha, supported in two versions:
  - .ass files (Advanced SubStation Alpha v4.0+), format identifier is "ass".
  - .ssa files (SubStation Alpha v4.0), format identifier is "ssa".
- SubRip: .srt files, format identifier is "srt".
- MicroDVD: .sub files, format identifier is "microdvd".
- MPL2: time-based format similar to MicroDVD, format identifier is "mpl2". To save subtitles in MPL2 format, use subs.save("subtitles.txt", format_="mpl2").
- JSON-serialized internal representation, which amounts to ASS. Format identifier is "json".
pysubs2: the main module
- pysubs2.load
Alias for SSAFile.load().
- pysubs2.make_time(h=0, m=0, s=0, ms=0, frames=None, fps=None)
Alias for pysubs2.time.make_time().
- class pysubs2.Color
(r, g, b, a) namedtuple for 8-bit RGB color with alpha channel.
All values are ints from 0 to 255.
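To illustrate the (r, g, b, a) layout described above, here is a minimal sketch using only the standard library. The `Color` below is a stand-in, not the library's own definition, and the helper `is_valid_color` is a hypothetical name introduced for this example.

```python
from collections import namedtuple

# Stand-in for illustration only; pysubs2 ships its own Color type.
Color = namedtuple("Color", ["r", "g", "b", "a"])


def is_valid_color(color):
    """Return True when every channel is an int in the 8-bit range 0..255."""
    return all(isinstance(c, int) and 0 <= c <= 255 for c in color)


red = Color(r=255, g=0, b=0, a=0)
print(red.r, is_valid_color(red))
```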
SSAFile: a subtitle file
- class pysubs2.SSAFile
Subtitle file in SubStation Alpha format.
This class has a list-like interface which exposes SSAFile.events, the list of subtitles in the file:
    subs = SSAFile.load("subtitles.srt")
    for line in subs:
        print(line.text)
    subs.insert(0, SSAEvent(start=0, end=make_time(s=2.5), text="New first subtitle"))
    del subs[0]
- aegisub_project = None
Dict with Aegisub project data, i.e. the [Aegisub Project Garbage] section.
- format = None
Format of the source subtitle file, if applicable, e.g. "srt".
- fps = None
Framerate used when reading the file, if applicable.
- info = None
Dict with script metadata, i.e. the [Script Info] section.
Reading and writing subtitles
Using path to file
- classmethod SSAFile.load(path, encoding='utf-8', format_=None, fps=None, **kwargs)
Load subtitle file from given path.
Parameters:
  - path (str) – Path to subtitle file.
  - encoding (str) – Character encoding of input file. Defaults to UTF-8; you may need to change this.
  - format_ (str) – Optional, forces use of a specific parser (e.g. “srt”, “ass”). Otherwise, the format is detected automatically from file contents. This argument should rarely be needed.
  - fps (float) – Framerate for frame-based formats (MicroDVD); for other formats this argument is ignored. The framerate might be detected from the file, in which case you don’t need to specify it here (when given, this argument overrides autodetection).
  - kwargs – Extra options for the parser.
Returns: SSAFile
Raises:
Note
pysubs2 may autodetect subtitle format and/or framerate. These values are set as the SSAFile.format and SSAFile.fps attributes.
Example
>>> subs1 = pysubs2.load("subrip-subtitles.srt")
>>> subs2 = pysubs2.load("microdvd-subtitles.sub", fps=23.976)
- SSAFile.save(path, encoding='utf-8', format_=None, fps=None, **kwargs)
Save subtitle file to given path.
Parameters:
  - path (str) – Path to subtitle file.
  - encoding (str) – Character encoding of output file. Defaults to UTF-8, which should be fine for most purposes.
  - format_ (str) – Optional, specifies the desired subtitle format (e.g. “srt”, “ass”). Otherwise, the format is detected automatically from the file extension, so this argument is rarely needed.
  - fps (float) – Framerate for frame-based formats (MicroDVD); for other formats this argument is ignored. When omitted, the SSAFile.fps value is used (i.e. the framerate used for loading the file, if any). When the SSAFile wasn’t loaded from MicroDVD, or if you wish to save it with a different framerate, use this argument. See also SSAFile.transform_framerate() for fixing bad frame-based to time-based conversions.
  - kwargs – Extra options for the writer.
Raises:
Using string
- classmethod SSAFile.from_string(string, format_=None, fps=None, **kwargs)
Load subtitle file from string.
See SSAFile.load() for full description.
Parameters: string (str) – Subtitle file in a string. Note that the string must be Unicode (in Python 2).
Returns: SSAFile
Example
>>> text = '''
... 1
... 00:00:00,000 --> 00:00:05,000
... An example SubRip file.
... '''
>>> subs = SSAFile.from_string(text)
- SSAFile.to_string(format_, fps=None, **kwargs)
Get subtitle file as a string.
See SSAFile.save() for full description.
Returns: str
Using file object
- classmethod SSAFile.from_file(fp, format_=None, fps=None, **kwargs)
Read subtitle file from file object.
See SSAFile.load() for full description.
Note
This is a low-level method. Usually, one of SSAFile.load() or SSAFile.from_string() is preferable.
Parameters: fp (file object) – A file object, i.e. an io.TextIOBase instance. Note that the file must be opened in text mode (as opposed to binary).
Returns: SSAFile
- SSAFile.to_file(fp, format_, fps=None, **kwargs)
Write subtitle file to file object.
See SSAFile.save() for full description.
Note
This is a low-level method. Usually, one of SSAFile.save() or SSAFile.to_string() is preferable.
Parameters: fp (file object) – A file object, i.e. an io.TextIOBase instance. Note that the file must be opened in text mode (as opposed to binary).
Retiming subtitles
- SSAFile.shift(h=0, m=0, s=0, ms=0, frames=None, fps=None)
Shift all subtitles by a constant time amount.
Shift may be time-based (the default) or frame-based. In the latter case, specify both frames and fps; h, m, s and ms will then be ignored.
Parameters:
  - h, m, s, ms – Integer or float values, may be positive or negative.
  - frames (int) – When specified, must be an integer number of frames. May be positive or negative. fps must also be specified.
  - fps (float) – When specified, must be a positive number.
Raises: ValueError – Invalid fps or missing number of frames.
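The two shift modes described above can be sketched on plain (start, end) millisecond pairs. This is a standalone illustration, not the library's code: the function name `shift_times` and the tuple representation are assumptions made for the example, as is rounding to whole milliseconds.

```python
def shift_times(events, h=0, m=0, s=0, ms=0, frames=None, fps=None):
    """Shift (start, end) millisecond pairs by a constant amount."""
    if frames is not None or fps is not None:
        # Frame-based shift: both values are required; h, m, s, ms are ignored.
        if frames is None or fps is None:
            raise ValueError("both frames and fps must be specified")
        if fps <= 0:
            raise ValueError("fps must be a positive number")
        delta = int(round(frames * 1000 / fps))
    else:
        # Time-based shift (the default).
        delta = int(round(h * 3600000 + m * 60000 + s * 1000 + ms))
    return [(start + delta, end + delta) for start, end in events]


print(shift_times([(0, 1000)], s=2.5))
print(shift_times([(1000, 2000)], frames=-25, fps=25))
```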
- SSAFile.transform_framerate(in_fps, out_fps)
Rescale all timestamps by ratio of in_fps/out_fps.
Can be used to fix files converted from frame-based to time-based with wrongly assumed framerate.
Parameters:
  - in_fps (float) –
  - out_fps (float) –
Raises: ValueError – Non-positive framerate given.
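As a rough sketch of the rescaling described above, the function below multiplies every timestamp by in_fps/out_fps. It works on plain (start, end) millisecond pairs; the name `rescale_timestamps` and the rounding to whole milliseconds are assumptions of this example rather than documented behavior.

```python
def rescale_timestamps(events, in_fps, out_fps):
    """Rescale (start, end) millisecond pairs by the ratio in_fps / out_fps."""
    if in_fps <= 0 or out_fps <= 0:
        raise ValueError("framerates must be positive")
    ratio = in_fps / out_fps
    return [(int(round(start * ratio)), int(round(end * ratio)))
            for start, end in events]


# With in_fps=25 and out_fps=50 the ratio is 0.5, so every timestamp is halved.
print(rescale_timestamps([(1000, 2000)], 25, 50))
```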
Working with styles
- SSAFile.rename_style(old_name, new_name)
Rename a style, including references to it.
Parameters:
  - old_name (str) – Style to be renamed.
  - new_name (str) – New name for the style (must be unused).
Raises:
  - KeyError – No style named old_name.
  - ValueError – new_name is not a legal name (cannot use commas) or new_name is taken.
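The renaming contract above (update the style table and every reference, with the listed error cases) can be sketched on plain dicts. This is an illustration only: the dict-based `styles` and `events` structures and the free function `rename_style` are stand-ins invented for the example.

```python
def rename_style(styles, events, old_name, new_name):
    """Rename a key in `styles` and update event dicts that reference it."""
    if old_name not in styles:
        raise KeyError("No style named {!r}".format(old_name))
    if "," in new_name:
        raise ValueError("Style name cannot contain commas")
    if new_name in styles:
        raise ValueError("Style name {!r} is already taken".format(new_name))
    styles[new_name] = styles.pop(old_name)
    for event in events:
        if event["style"] == old_name:
            event["style"] = new_name


styles = {"Default": {"fontname": "Arial"}}
events = [{"style": "Default", "text": "Hello"}]
rename_style(styles, events, "Default", "Main")
print(styles, events)
```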
Misc methods
- SSAFile.equals(other)
Equality of two SSAFiles.
Compares SSAFile.info, SSAFile.styles and SSAFile.events. Order of entries in OrderedDicts does not matter. The “ScriptType” key in info is considered an implementation detail and thus ignored.
Useful mostly in unit tests. Differences are logged at DEBUG level.
- SSAFile.sort()
Sort subtitles time-wise, in-place.
SSAEvent: one subtitle
- class pysubs2.SSAEvent(**fields)
A SubStation Event, i.e. one subtitle.
In SubStation, each subtitle consists of multiple “fields” like Start, End and Text. These are exposed as attributes (note that they are lowercase; see SSAEvent.FIELDS for a list). Additionally, there are some convenience properties like SSAEvent.plaintext or SSAEvent.duration.
This class defines an ordering with respect to (start, end) timestamps.
Tip
Use pysubs2.make_time() to get times in milliseconds.
Example:
>>> ev = SSAEvent(start=make_time(s=1), end=make_time(s=2.5), text="Hello World!")
- FIELDS = frozenset({'layer', 'start', 'marginl', 'marginv', 'type', 'marginr', 'style', 'effect', 'text', 'end', 'name', 'marked'})
All fields in SSAEvent.
- copy()
Return a copy of the SSAEvent.
- duration
Subtitle duration in milliseconds (read/write property).
Writing to this property adjusts SSAEvent.end. Setting a negative duration raises ValueError.
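The read/write behavior of duration described above can be sketched with an ordinary Python property. `TimedEvent` is a hypothetical stand-in class written for this example; it only models start and end times in milliseconds.

```python
class TimedEvent:
    """Hypothetical stand-in holding start/end times in milliseconds."""

    def __init__(self, start=0, end=0):
        self.start = start
        self.end = end

    @property
    def duration(self):
        return self.end - self.start

    @duration.setter
    def duration(self, ms):
        # Writing the duration moves the end time; the start stays put.
        if ms < 0:
            raise ValueError("Subtitle duration cannot be negative")
        self.end = self.start + ms


ev = TimedEvent(start=1000, end=2500)
initial_duration = ev.duration
ev.duration = 500
print(initial_duration, ev.end)
```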
- effect = None
Line effect
- end = None
Subtitle end time (in milliseconds)
- equals(other)
Field-based equality for SSAEvents.
- is_comment
When true, the subtitle is a comment, i.e. not visible (read/write property).
Setting this property is equivalent to changing SSAEvent.type to "Dialogue" or "Comment".
- layer = None
Layer number, 0 is the lowest layer (ASS only)
- marginl = None
Left margin
- marginr = None
Right margin
- marginv = None
Vertical margin
- marked = None
(SSA only)
- name = None
Actor name
- plaintext
Subtitle text as multi-line string with no tags (read/write property).
Writing to this property replaces SSAEvent.text with the given plain text. Newlines are converted to \N tags.
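A minimal sketch of the conversion between tagged text and plain text follows. It assumes that override tags are brace-delimited blocks such as {\i1}, which this reference does not spell out; the function names are hypothetical and the library's own handling may cover more cases.

```python
import re

# Assumption: override tags are brace-delimited blocks such as {\i1}.
OVERRIDE_TAGS = re.compile(r"\{[^}]*\}")


def to_plaintext(text):
    """Strip override tags and turn SubStation line breaks into newlines."""
    return OVERRIDE_TAGS.sub("", text).replace(r"\N", "\n")


def from_plaintext(plain):
    """Turn real newlines into SubStation line-break tags."""
    return plain.replace("\n", r"\N")


print(to_plaintext(r"{\i1}Hello{\i0}\NWorld"))
```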
- shift(h=0, m=0, s=0, ms=0, frames=None, fps=None)
Shift start and end times.
See SSAFile.shift() for full description.
- start = None
Subtitle start time (in milliseconds)
- style = None
Style name
- text = None
Text of subtitle (with SubStation override tags)
- type = None
Line type (Dialogue/Comment)
SSAStyle: a subtitle style
- class pysubs2.SSAStyle(**fields)
A SubStation Style.
In SubStation, each subtitle (SSAEvent) is associated with a style which defines its font, color, etc. Like a subtitle event, a style also consists of “fields”; see SSAStyle.FIELDS for a list (note the spelling, which is different from SubStation proper).
Subtitles and styles are connected via the SSAFile they belong to. SSAEvent.style is a string which is (or should be) a key in the SSAFile.styles dict. Note that the style name is stored separately; a given SSAStyle instance has no particular name itself.
This class defines equality (equality of all fields).
- FIELDS = frozenset({'marginv', 'encoding', 'primarycolor', 'strikeout', 'borderstyle', 'alignment', 'alphalevel', 'marginl', 'italic', 'fontname', 'underline', 'secondarycolor', 'backcolor', 'outline', 'scaley', 'bold', 'angle', 'outlinecolor', 'spacing', 'marginr', 'scalex', 'fontsize', 'tertiarycolor', 'shadow'})
All fields in SSAStyle.
- alignment = None
Numpad-style alignment, e.g. 7 is “top left” (that is, ASS alignment semantics)
- alphalevel = None
Old, unused SSA-only field
- angle = None
Rotation (ASS only)
- backcolor = None
Back, i.e. shadow color (pysubs2.Color instance)
- bold = None
Bold
- borderstyle = None
Border style
- encoding = None
Charset
- fontname = None
Font name
- fontsize = None
Font size (in pixels)
- italic = None
Italic
- marginl = None
Left margin (in pixels)
- marginr = None
Right margin (in pixels)
- marginv = None
Vertical margin (in pixels)
- outline = None
Outline width (in pixels)
- outlinecolor = None
Outline color (pysubs2.Color instance)
- primarycolor = None
Primary color (pysubs2.Color instance)
- scalex = None
Horizontal scaling (ASS only)
- scaley = None
Vertical scaling (ASS only)
- secondarycolor = None
Secondary color (pysubs2.Color instance)
- shadow = None
Shadow depth (in pixels)
- spacing = None
Letter spacing (ASS only)
- strikeout = None
Strikeout (ASS only)
- tertiarycolor = None
Tertiary color (pysubs2.Color instance)
- underline = None
Underline (ASS only)
pysubs2.time: time-related utilities
- pysubs2.time.TIMESTAMP = re.compile('(\\d{1,2}):(\\d{2}):(\\d{2})[.,](\\d{2,3})')
Pattern that matches both SubStation and SubRip timestamps.
- pysubs2.time.frames_to_ms(frames, fps)
Convert frame-based duration to milliseconds.
Parameters:
  - frames – Number of frames (should be int).
  - fps – Framerate (must be a positive number, e.g. 23.976).
Returns: Number of milliseconds (rounded to int).
Raises: ValueError – fps was negative or zero.
- pysubs2.time.make_time(h=0, m=0, s=0, ms=0, frames=None, fps=None)
Convert time to milliseconds.
See pysubs2.time.times_to_ms(). When both frames and fps are specified, pysubs2.time.frames_to_ms() is called instead.
Raises: ValueError – Invalid fps, or one of frames/fps is missing.
Example
>>> make_time(s=1.5)
1500
>>> make_time(frames=50, fps=25)
2000
- pysubs2.time.ms_to_frames(ms, fps)
Convert milliseconds to number of frames.
Parameters:
  - ms – Number of milliseconds (may be int, float or other numeric class).
  - fps – Framerate (must be a positive number, e.g. 23.976).
Returns: Number of frames (int).
Raises: ValueError – fps was negative or zero.
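The two frame conversions documented above are inverse scalings by the framerate. The sketch below shows one way to write them; the `_sketch` suffix marks these as illustrative stand-ins, and rounding to the nearest integer in `ms_to_frames_sketch` is an assumption (the reference only states that an int is returned).

```python
def frames_to_ms_sketch(frames, fps):
    """Convert a frame count to milliseconds, rounded to int."""
    if fps <= 0:
        raise ValueError("Framerate must be a positive number")
    return int(round(frames * 1000 / fps))


def ms_to_frames_sketch(ms, fps):
    """Convert milliseconds to a frame count (nearest frame, by assumption)."""
    if fps <= 0:
        raise ValueError("Framerate must be a positive number")
    return int(round(ms / 1000 * fps))


print(frames_to_ms_sketch(50, 25), ms_to_frames_sketch(2000, 25))
```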
- pysubs2.time.ms_to_str(ms, fractions=False)
Pretty-print milliseconds as [-]H:MM:SS[.mmm].
Handles huge and/or negative times. Non-negative times with fractions=True are matched by pysubs2.time.TIMESTAMP.
Parameters:
  - ms – Number of milliseconds (int, float or other numeric class).
  - fractions – Whether to print up to millisecond precision.
Returns: str
- pysubs2.time.ms_to_times(ms)
Convert milliseconds to normalized tuple (h, m, s, ms).
Parameters: ms – Number of milliseconds (may be int, float or other numeric class). Should be non-negative.
Returns: Named tuple (h, m, s, ms) of ints.
Invariants: ms in range(1000) and s in range(60) and m in range(60)
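The normalization and pretty-printing described above can be sketched with repeated divmod calls. `split_ms` and `format_ms` are hypothetical names for this example; the sketch returns a plain tuple rather than a named tuple and assumes fractional milliseconds are rounded first.

```python
def split_ms(ms):
    """Split non-negative milliseconds into normalized (h, m, s, ms) ints."""
    ms = int(round(ms))
    h, ms = divmod(ms, 3600000)
    m, ms = divmod(ms, 60000)
    s, ms = divmod(ms, 1000)
    return h, m, s, ms


def format_ms(ms, fractions=False):
    """Format milliseconds as [-]H:MM:SS, optionally with .mmm appended."""
    sign = "-" if ms < 0 else ""
    h, m, s, frac = split_ms(abs(ms))
    text = "{}{:d}:{:02d}:{:02d}".format(sign, h, m, s)
    if fractions:
        text += ".{:03d}".format(frac)
    return text


print(split_ms(3723004), format_ms(3723004, fractions=True))
```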
- pysubs2.time.times_to_ms(h=0, m=0, s=0, ms=0)
Convert hours, minutes, seconds to milliseconds.
Arguments may be positive or negative, int or float, and need not be normalized (s=120 is okay).
Returns: Number of milliseconds (rounded to int).
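As a small illustration of why arguments need not be normalized, the sketch below simply sums the weighted components and rounds once at the end. The name `to_ms` is a stand-in for this example.

```python
def to_ms(h=0, m=0, s=0, ms=0):
    """Sum the weighted components and round to whole milliseconds."""
    return int(round(h * 3600000 + m * 60000 + s * 1000 + ms))


# s=120 and m=2 describe the same instant, so both give the same result.
print(to_ms(s=120), to_ms(m=2), to_ms(s=1.5))
```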
- pysubs2.time.timestamp_to_ms(groups)
Convert groups from a pysubs2.time.TIMESTAMP match to milliseconds.
Example
>>> timestamp_to_ms(TIMESTAMP.match("0:00:00.42").groups())
420
pysubs2.exceptions: thrown exceptions
- exception pysubs2.exceptions.FormatAutodetectionError
Subtitle format is ambiguous or unknown.
- exception pysubs2.exceptions.Pysubs2Error
Base class for pysubs2 exceptions.
- exception pysubs2.exceptions.UnknownFPSError
Framerate was not specified and couldn’t be inferred otherwise.
- exception pysubs2.exceptions.UnknownFileExtensionError
File extension does not pertain to any known subtitle format.
- exception pysubs2.exceptions.UnknownFormatIdentifierError
Unknown subtitle format identifier (i.e. a string like "srt").
pysubs2.formats: subtitle format implementations
Note
This submodule contains pysubs2 internals. It’s mostly of interest if you’re looking to implement a subtitle format not supported by the library. In that case, have a look at pysubs2.formats.FormatBase.
- pysubs2.formats.FILE_EXTENSION_TO_FORMAT_IDENTIFIER = {'.ass': 'ass', '.json': 'json', '.srt': 'srt', '.ssa': 'ssa', '.sub': 'microdvd'}
Dict mapping file extensions to format identifiers.
- pysubs2.formats.FORMAT_IDENTIFIER_TO_FORMAT_CLASS = {'ass': <class 'pysubs2.substation.SubstationFormat'>, 'json': <class 'pysubs2.jsonformat.JSONFormat'>, 'microdvd': <class 'pysubs2.microdvd.MicroDVDFormat'>, 'mpl2': <class 'pysubs2.mpl2.MPL2Format'>, 'srt': <class 'pysubs2.subrip.SubripFormat'>, 'ssa': <class 'pysubs2.substation.SubstationFormat'>}
Dict mapping format identifiers to implementations (FormatBase subclasses).
- pysubs2.formats.autodetect_format(content)
Return format identifier for given fragment or raise FormatAutodetectionError.
- pysubs2.formats.get_file_extension(format_)
Format identifier -> file extension
- pysubs2.formats.get_format_class(format_)
Format identifier -> format class (i.e. subclass of FormatBase)
- pysubs2.formats.get_format_identifier(ext)
File extension -> format identifier
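The extension lookup can be sketched as a plain dict access. The mapping below is copied from the dict shown above; the function name `identifier_for_path`, the lowercasing of the extension, and the use of ValueError in place of the library's own exception are all assumptions of this example.

```python
import os.path

# Copied from the mapping documented above.
EXTENSION_TO_IDENTIFIER = {
    ".ass": "ass",
    ".json": "json",
    ".srt": "srt",
    ".ssa": "ssa",
    ".sub": "microdvd",
}


def identifier_for_path(path):
    """Return the format identifier for a path, based on its extension."""
    ext = os.path.splitext(path)[1].lower()
    try:
        return EXTENSION_TO_IDENTIFIER[ext]
    except KeyError:
        # Stand-in for the library's unknown-extension error.
        raise ValueError("Unknown file extension: {!r}".format(ext))


print(identifier_for_path("movie.srt"), identifier_for_path("movie.sub"))
```

Note that ".txt" is absent from the mapping, which is presumably why the MPL2 example earlier in this reference passes format_="mpl2" explicitly when saving.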
- class pysubs2.formats.FormatBase
Base class for subtitle format implementations.
How to implement a new subtitle format:
  - Create a subclass of FormatBase and override the methods you want to support.
  - Decide on a format identifier, like the "srt" or "microdvd" already used in the library.
  - Add your identifier and class to pysubs2.formats.FORMAT_IDENTIFIER_TO_FORMAT_CLASS.
  - (optional) Add your file extension and identifier to pysubs2.formats.FILE_EXTENSION_TO_FORMAT_IDENTIFIER.
After finishing these steps, you can call SSAFile.load() and SSAFile.save() with your format, including autodetection from content and file extension (if you provided these).
- classmethod from_file(subs, fp, format_, **kwargs)
Load subtitle file into an empty SSAFile.
If the parser autodetects framerate, set it as subs.fps.
Parameters:
  - subs (SSAFile) – An empty SSAFile.
  - fp (file object) – Text file object, the subtitle file.
  - format_ (str) – Format identifier. Used when one format class implements multiple formats (see SubstationFormat).
  - kwargs – Extra options, e.g. fps.
Returns: None
Raises: pysubs2.exceptions.UnknownFPSError – Framerate was not provided and cannot be detected.
- classmethod guess_format(text)
Return format identifier of recognized format, or None.
Parameters: text (str) – Content of subtitle file. When the file is long, this may be only its first few thousand characters.
Returns: format identifier (e.g. "srt") or None (unknown format)
- classmethod to_file(subs, fp, format_, **kwargs)
Write SSAFile into a file.
If you need framerate and it is not passed in keyword arguments, use subs.fps.
Parameters:
  - subs (SSAFile) – Subtitle file to write.
  - fp (file object) – Text file object used as output.
  - format_ (str) – Format identifier of desired output format. Used when one format class implements multiple formats (see SubstationFormat).
  - kwargs – Extra options, e.g. fps.
Returns: None
Raises: pysubs2.exceptions.UnknownFPSError – Framerate was not provided and subs.fps is None.